Refactor sync script #175

Merged
merged 21 commits into from
Nov 13, 2024
2 changes: 2 additions & 0 deletions .coderabbit.yaml
@@ -2,5 +2,7 @@ early_access: true
reviews:
  path_filters:
    - "!content/blog/!*.md"
    - "!content/blog/sync_status.yml"

Review comment (Member): Let's avoid storing data files under content. As I remember, they should live in the data folder; please check the Hugo docs on this.

    - "!test/fixtures/"
    - "!wp-content/**"
    - "!wp-includes/**"
4 changes: 2 additions & 2 deletions .github/workflows/sync-and-publish.yml
@@ -33,9 +33,9 @@ jobs:
  run: |
    if [ "${{ github.event.inputs.force }}" = "true" ]
    then
-     bin/from_devto -f
+     bin/sync_with_devto -f
    else
-     bin/from_devto
+     bin/sync_with_devto
    fi

    bin/upload_assets_to_github
170 changes: 0 additions & 170 deletions bin/from_devto

This file was deleted.

49 changes: 49 additions & 0 deletions bin/sync/article_cleaner.rb
@@ -0,0 +1,49 @@
require 'fileutils'
require 'yaml'

module ArticleCleaner
  SYNC_STATUS_FILE = 'sync_status.yml'.freeze
  ARTICLE_FILE = 'index.md'.freeze

  def cleanup_renamed_articles
    raise ArgumentError, "Working directory doesn't exist" unless Dir.exist?(working_dir)

    deleted_folders = []
    slugs = load_slugs_from_yaml

    Dir.glob("#{working_dir}/*").each do |folder_path|
      next unless File.directory?(folder_path) && File.exist?("#{folder_path}/#{ARTICLE_FILE}")

      folder_name = File.basename(folder_path)
      unless slugs.include?(folder_name)
        begin
          FileUtils.rm_rf(folder_path)
          deleted_folders << folder_name
          puts "Deleted folder: #{folder_name}"
        rescue StandardError => e
          puts "Failed to delete folder #{folder_name}: #{e.message}"
        end
      end
    end
    deleted_folders
  end

  private

  def load_slugs_from_yaml
    yaml_path = File.join(working_dir, SYNC_STATUS_FILE)

    begin
      yaml_data = YAML.load_file(yaml_path)
      raise "Invalid YAML structure" unless yaml_data.is_a?(Hash)

      yaml_data.values.map do |article|
        raise "Invalid article data structure" unless article.is_a?(Hash) && article[:slug]
        article[:slug]
      end
    rescue StandardError => e
      logger.error "Failed to load slugs from YAML: #{e.message}"
      []
    end
  end
end
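Reviewer note: the selection rule in cleanup_renamed_articles can be checked without deleting anything. The following is a minimal dry-run sketch, not code from this PR; `stale_article_folders` and `demo_stale_folders` are hypothetical names, and the folder names and status hash are made-up sample data. It assumes the same rule as the diff above: a folder is a candidate only if it contains index.md and its name is not among the known slugs.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not part of the PR): returns the names of the folders
# that the cleanup rule would select, without removing them.
def stale_article_folders(working_dir, slugs)
  candidates = Dir.glob(File.join(working_dir, '*')).select do |path|
    File.directory?(path) &&
      File.exist?(File.join(path, 'index.md')) &&
      !slugs.include?(File.basename(path))
  end
  candidates.map { |path| File.basename(path) }.sort
end

# Builds a throwaway directory and applies the rule twice:
# once with a known slug, once with an empty slug list.
def demo_stale_folders
  Dir.mktmpdir do |dir|
    %w[current-post renamed-post].each do |name|
      FileUtils.mkdir_p(File.join(dir, name))
      File.write(File.join(dir, name, 'index.md'), '# post')
    end
    # No index.md here, so this folder is never a candidate.
    FileUtils.mkdir_p(File.join(dir, 'assets'))

    # Sample status data in the same shape the diff reads (symbol keys).
    status = { 1 => { slug: 'current-post', synced: true } }
    known_slugs = status.values.map { |article| article[:slug] }

    [stale_article_folders(dir, known_slugs), stale_article_folders(dir, [])]
  end
end
```

With an empty slug list every folder that contains index.md is selected. Since load_slugs_from_yaml returns [] when the status file cannot be read, it may be worth guarding the cleanup against that case.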
61 changes: 61 additions & 0 deletions bin/sync/article_sync_checker.rb
@@ -0,0 +1,61 @@
require 'json'

module ArticleSyncChecker
  USERNAME = 'jetthoughts'.freeze
  SYNC_STATUS_FILE = 'sync_status.yml'.freeze
  USELESS_WORDS = %w[and the a but to is so].freeze

  def update_sync_status
    ensure_sync_status_file_exists
    @sync_status = sync_status
    update_status(fetch_articles)
    save_sync_status
  end

  private

  def ensure_sync_status_file_exists
    sync_file_path = File.join(working_dir, SYNC_STATUS_FILE)

    unless File.exist?(sync_file_path)
      File.open(sync_file_path, 'w') { |file| file.write({}.to_yaml) }
    end
  end

  def save_sync_status
    File.open(File.join(working_dir, SYNC_STATUS_FILE), 'w') do |file|
      file.write(@sync_status.to_yaml)
    end
  end

  def fetch_articles
    response = http_client.get_articles(USERNAME, 0)
    JSON.parse(response.body)
  end

  def slug(article)
    slug_parts = article['slug'].split('-')[0..-2]
    tags = article['tags'] ? article['tags'].split(", ") : []
    selected_tags = tags.first(2)
    [slug_parts, selected_tags]
      .flatten
      .uniq
      .reject { |segment| USELESS_WORDS.include?(segment) }
      .compact
      .join('-')
  end

  def update_status(articles)
    articles.each do |article|
      id = article['id']
      edited_at = article["edited_at"] || article["created_at"]

      @sync_status[id] ||= { edited_at: edited_at, slug: slug(article), synced: false }

      if @sync_status[id][:edited_at] != edited_at
        @sync_status[id][:edited_at] = edited_at
        @sync_status[id][:synced] = false
      end
    end
  end
end
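Reviewer note: a worked example may make the slug derivation easier to review. The sketch below copies the logic of `slug` into a standalone function so it can run on its own. `derive_slug`, `FILLER_WORDS`, and the sample article hash are hypothetical and not part of this PR; the sample assumes `tags` arrives as a comma-and-space separated string, which is what the `split(", ")` call in the diff expects.

```ruby
# Same word list as USELESS_WORDS in the diff, under a hypothetical name
# so this sketch stays self-contained.
FILLER_WORDS = %w[and the a but to is so].freeze

# Standalone copy of the slug logic, for illustration only.
def derive_slug(article)
  # Drop the last hyphen-separated segment of the remote slug.
  slug_parts = article['slug'].split('-')[0..-2]
  # Keep at most the first two tags.
  tags = article['tags'] ? article['tags'].split(', ') : []

  [slug_parts, tags.first(2)]
    .flatten
    .uniq
    .reject { |segment| FILLER_WORDS.include?(segment) }
    .join('-')
end

# Made-up sample input; the trailing "4k2j" stands in for the final segment.
sample = { 'slug' => 'how-to-test-rails-and-hotwire-4k2j', 'tags' => 'rails, testing' }
puts derive_slug(sample) # prints "how-test-rails-hotwire-testing"
```

Step by step for the sample: the slug splits into how, to, test, rails, and, hotwire after the last segment is dropped; the tags add rails and testing; `uniq` removes the second rails; the filler words to and and are rejected. An article without tags simply keeps its trimmed slug segments.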