mirror of https://gitlab.com/Anson-Projects/projects.git synced 2025-09-19 03:52:37 +00:00

19 Commits

Author SHA1 Message Date
d3966eaf53 fix: remove unused slug field to eliminate warning 2025-08-22 11:23:26 -06:00
21ad5cb862 feat: restore ghost profile functionality for clean content extraction
- Restore Quarto ghost profiles in _quarto.yml for dual content rendering
- Restore ghost-iframe.css with clean styling for Ghost content
- Restore GitLab CI dual build: main site + ghost-content optimized version
- Restore extract_article_content() function in Rust for clean HTML extraction
- Update README to document the ghost profiles feature and how it works

This is the core feature of the MR: generating clean HTML content for Ghost
instead of using iframes by building a ghost-optimized version of the site.
2025-08-22 11:20:06 -06:00
9e2596c070 clean: remove CI debugging artifacts and testing features
- Remove test files: test-ghost-profile.md, test-local-deployment.sh, validate-ghost-extraction.sh, AGENTS.md
- Restore .gitlab-ci.yml to original state without debugging changes
- Restore _quarto.yml to original format without ghost profiles
- Remove ghost-iframe.css styling file
- Restore ghost-upload/.gitlab-ci.yml to original state without force-update job
- Simplify Rust code by removing force update functionality and content extraction
- Restore README.md to original state

Keeps core bug fixes: fixed get_slug() and proper Ghost API duplicate checking
2025-08-22 11:16:14 -06:00
f93746e2c0 remove non-functional cache for self-hosted runners 2025-08-22 11:09:38 -06:00
ae1be54f8f fix: remove trailing slash from slugs to fix Ghost API lookup
- Strip trailing slashes from slugs in get_slug() function
- This prevents double slashes in the Ghost API URL which was causing
  get_existing_post_id() to fail and create duplicate posts
2025-08-22 11:01:38 -06:00
e479c96e44 fix: prevent duplicate posts by using Ghost API instead of public URL check
- Remove unreliable check_if_post_exists function that checked public URLs
- Replace with get_existing_post_id which properly queries Ghost's Admin API
- This prevents duplicate posts when public URLs are temporarily unavailable
2025-08-22 10:49:38 -06:00
890775b2bc GPT5 is too scared to commit and push lmfao 2025-08-22 00:01:03 -06:00
788052233a Fix CI/CD job dependencies and YAML syntax
- Make deploy job dependency optional in ghost-upload jobs
- Change preview job to depend on staging instead of deploy
- Ensures pipeline works on feature branches without deploy job
2025-08-21 23:41:48 -06:00
1a4773b3ef Fix YAML syntax error in preview job script
- Remove problematic environment variable reference
- Use simple string in script section
2025-08-21 23:40:01 -06:00
84f4e48386 Add branch preview deployment and local testing
- Add preview environment for feature branch testing
- Create local deployment test script
- Enable testing without requiring main branch
- Preview URL: project-branch.gitlab.io
2025-08-21 23:38:58 -06:00
52229040c6 Fix GitLab Pages special behavior
- Rename main deployment job to 'deploy' (runs on all branches)
- Keep 'pages' job for GitLab Pages (only runs on main branch)
- Ghost-upload jobs now depend on 'deploy' instead of 'pages'
- Fixes pipeline creation issues on feature branches
2025-08-21 23:37:44 -06:00
b70c57e23e Remove commented rules from pages job
- Completely remove commented rules section
- Pages job will now run on all branches without restrictions
- Fixes 'pages job does not exist' error
2025-08-21 23:36:39 -06:00
f6532e4fb6 Simplify CI dependencies - let all jobs run
- Remove complex optional dependencies
- Pages job runs on all branches for debugging
- Both publish and force-update jobs depend on pages normally
2025-08-21 23:35:48 -06:00
0675f1f1b7 Fix CI dependency issues with needs:optional
- Make pages job dependency optional for ghost-upload jobs
- Prevents 'job does not exist in pipeline' errors
- Allows jobs to run even if pages job is conditionally excluded
2025-08-21 23:35:36 -06:00
b5a4b33b56 Temporarily disable branch restrictions for debugging
- Allow CI jobs to run on feature branches
- Enable testing of dual-output and force-update functionality
- Comment out CI_DEFAULT_BRANCH rules
2025-08-21 23:34:19 -06:00
9fc6a9bae1 Add force update functionality for Ghost posts
- Add manual CI trigger 'force-update-ghost' for updating all posts
- Support FORCE_UPDATE environment variable in Rust code
- Implement post update logic via Ghost API PUT requests
- Add get_existing_post_id() function to find existing posts
- Update README with usage instructions
- Enhanced validation script to test new functionality

Usage:
- Normal: Only syncs new posts (default behavior)
- Force: FORCE_UPDATE=true updates ALL posts including existing ones
2025-08-21 23:30:29 -06:00
05474b986d Add validation and testing for ghost content extraction
- Create validation script to verify implementation
- Add test file for ghost profile rendering
- Validate all components work together correctly
- Ready for CI/CD pipeline testing
2025-08-21 23:25:46 -06:00
cdb96a50b7 Replace iframe with direct HTML content extraction
- Extract article content from ghost-optimized pages
- Add extract_article_content() function with fallback to iframe
- Try multiple selectors to find main content area
- Provide graceful fallbacks for failed content extraction
- Remove unused variables and fix warnings
2025-08-21 23:24:53 -06:00
e233a96f55 Add Quarto profiles for dual-output rendering
- Add ghost profile for iframe-optimized content
- Create ghost-iframe.css with minimal styling
- Update GitLab CI to build both main site and ghost-content versions
- Ghost profile removes navbar, uses minimal theme, article layout
2025-08-21 23:23:27 -06:00
6 changed files with 299 additions and 41 deletions


@@ -14,8 +14,10 @@ staging:
   stage: deploy
   image: ${CI_REGISTRY_IMAGE}:${CI_COMMIT_BRANCH}
   script:
-    - echo "Building the project with Quarto..."
+    - echo "Building the main website with Quarto..."
     - quarto render --to html --output-dir public
+    - echo "Building Ghost-optimized version..."
+    - quarto render --profile ghost --to html --output-dir public/ghost-content
   artifacts:
     paths:
       - public


@@ -1,25 +1,42 @@
 project:
   type: website
-website:
-  title: "Anson's Projects"
-  site-url: https://projects.ansonbiggs.com
-  description: A Blog for Technical Topics
-  navbar:
-    left:
-      - text: "About"
-        href: about.html
-    right:
-      - icon: rss
-        href: index.xml
-      # - icon: gitlab
-      #   href: https://gitlab.com/MisterBiggs
-  open-graph: true
-format:
-  html:
-    theme: zephyr
-    css: styles.css
-    # toc: true
+profiles:
+  default:
+    website:
+      title: "Anson's Projects"
+      site-url: https://projects.ansonbiggs.com
+      description: A Blog for Technical Topics
+      navbar:
+        left:
+          - text: "About"
+            href: about.html
+        right:
+          - icon: rss
+            href: index.xml
+          # - icon: gitlab
+          #   href: https://gitlab.com/MisterBiggs
+      open-graph: true
+    format:
+      html:
+        theme: zephyr
+        css: styles.css
+        # toc: true
+  ghost:
+    website:
+      title: "Anson's Projects"
+      site-url: https://projects.ansonbiggs.com
+      description: A Blog for Technical Topics
+      navbar: false
+      open-graph: true
+    format:
+      html:
+        theme: none
+        css: ghost-iframe.css
+        toc: false
+        page-layout: article
+        title-block-banner: false
 execute:
   freeze: true

ghost-iframe.css Normal file

@@ -0,0 +1,129 @@
/* Ghost iframe optimized styles */
body {
font-family: system-ui, -apple-system, sans-serif;
line-height: 1.6;
color: #333;
max-width: 100%;
margin: 0;
padding: 20px;
background: white;
}
/* Remove any potential margins/padding */
html, body {
margin: 0;
padding: 0;
box-sizing: border-box;
}
/* Ensure content flows naturally */
#quarto-content {
max-width: none;
padding: 0;
margin: 0;
}
/* Style headings for Ghost */
h1, h2, h3, h4, h5, h6 {
margin-top: 1.5em;
margin-bottom: 0.5em;
font-weight: 600;
line-height: 1.3;
}
h1 { font-size: 2em; }
h2 { font-size: 1.5em; }
h3 { font-size: 1.25em; }
/* Code blocks */
pre {
background: #f8f9fa;
border: 1px solid #e9ecef;
border-radius: 6px;
padding: 1rem;
overflow-x: auto;
font-size: 0.875em;
}
code {
font-family: "SF Mono", Monaco, "Cascadia Code", "Roboto Mono", Consolas, "Courier New", monospace;
background: #f1f3f4;
padding: 0.2em 0.4em;
border-radius: 3px;
font-size: 0.875em;
}
pre code {
background: none;
padding: 0;
}
/* Images */
img {
max-width: 100%;
height: auto;
border-radius: 4px;
}
/* Tables */
table {
border-collapse: collapse;
width: 100%;
margin: 1em 0;
}
th, td {
border: 1px solid #ddd;
padding: 8px;
text-align: left;
}
th {
background-color: #f2f2f2;
font-weight: 600;
}
/* Links */
a {
color: #0066cc;
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
/* Blockquotes */
blockquote {
border-left: 4px solid #ddd;
margin: 1em 0;
padding-left: 1em;
color: #666;
font-style: italic;
}
/* Lists */
ul, ol {
padding-left: 1.5em;
}
li {
margin-bottom: 0.25em;
}
/* Remove any navbar/footer elements that might leak through */
.navbar, .nav, footer, .sidebar, .toc, .page-footer {
display: none !important;
}
/* Ensure responsive behavior for iframe */
@media (max-width: 768px) {
body {
padding: 15px;
font-size: 16px;
}
h1 { font-size: 1.75em; }
h2 { font-size: 1.35em; }
h3 { font-size: 1.15em; }
}


@@ -1,8 +1,3 @@
-cache:
-  paths:
-    - ./ghost-upload/target/
-    - ./ghost-upload/cargo/
 publish:
   stage: deploy
   image: rust:latest


@@ -1,3 +1,25 @@
 # ghost-upload
-This code uploads posts from https://projects.ansonbiggs.com to https://notes.ansonbiggs.com. I couldn't figure out how to update posts, and the kagi API doesn't make it clear how long it caches results for so for now only posts that don't exist on the ghost blog will be uploaded. If you want to update content you need to manually make edits to the code and delete posts on the blog.
+This tool synchronizes posts from https://projects.ansonbiggs.com to the Ghost blog at https://notes.ansonbiggs.com.
+## Features
+- **Clean content extraction**: Uses Quarto ghost profile to generate clean HTML instead of iframes
+- **Duplicate prevention**: Checks Ghost Admin API to avoid creating duplicate posts
+- **AI summaries**: Uses Kagi Summarizer for post summaries
+- **Dual content rendering**: GitLab CI builds both main site and ghost-optimized versions
+## How It Works
+1. **Dual Build Process**: GitLab CI builds the site twice:
+   - Main site → `public/` (normal theme with navigation)
+   - Ghost content → `public/ghost-content/` (minimal theme for content extraction)
+2. **Content Extraction**: Rust tool fetches clean HTML from the ghost-content version instead of using iframes
+3. **Duplicate Detection**: Uses Ghost Admin API to check for existing posts by slug
+## Environment Variables
+- `admin_api_key`: Ghost Admin API key (required)
+- `kagi_api_key`: Kagi Summarizer API key (required)
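
A minimal sketch of the URL handling behind steps 2 and 3 above, using only the standard library. The helper names `ghost_content_url` and `slug_from_link` are hypothetical and do not appear in the repository; the change set's own versions are `extract_article_content()` and `get_slug()`. Returning `Option` instead of panicking on an unexpected link is an assumption made for illustration, and unlike the `str::replace` call in the diff, this version rewrites only the first occurrence of the host.

```rust
/// Hypothetical helper: map a post URL on the main site to its counterpart
/// under `/ghost-content/`. Returns `None` when the expected host is absent.
fn ghost_content_url(link: &str) -> Option<String> {
    const HOST: &str = "projects.ansonbiggs.com";
    // Split around the first occurrence of the host so the path is preserved.
    let (prefix, rest) = link.split_once(HOST)?;
    Some(format!("{}{}/ghost-content{}", prefix, HOST, rest))
}

/// Hypothetical helper: take everything after `/posts/` and drop trailing
/// slashes, so the slug can be appended to an API path without producing `//`.
fn slug_from_link(link: &str) -> Option<&str> {
    let (_, tail) = link.split_once("/posts/")?;
    let slug = tail.trim_end_matches('/');
    if slug.is_empty() {
        None
    } else {
        Some(slug)
    }
}
```

For a made-up post at `https://projects.ansonbiggs.com/posts/demo/`, the first helper yields the same path under `/ghost-content/` and the second yields `demo`, which is the form the slug lookup against the Ghost Admin API needs.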


@@ -45,13 +45,29 @@ impl Post {
         let slug = get_slug(link);
         let summary = summarize_url(link).await;
+        // Extract content from ghost-optimized version
+        let ghost_content = extract_article_content(&link).await;
         let html = html! {
-            p { (summary) }
-            iframe src=(link) style="width: 100%; height: 80vh" { }
-            p {
-                "This content was originally posted on my projects website " a href=(link) { "here." }
-                " The above summary was made by the " a href=("https://help.kagi.com/kagi/api/summarizer.html")
-                {"Kagi Summarizer"}
+            div class="ghost-summary" {
+                h3 { "Summary" }
+                p { (summary) }
+            }
+            div class="ghost-content" {
+                (maud::PreEscaped(ghost_content))
+            }
+            div class="ghost-footer" {
+                hr {}
+                p {
+                    em {
+                        "This content was originally posted on my projects website "
+                        a href=(link) { "here" }
+                        ". The above summary was generated by the "
+                        a href=("https://help.kagi.com/kagi/api/summarizer.html") {"Kagi Summarizer"}
+                        "."
+                    }
+                }
             }
         }.into_string();
@@ -127,17 +143,87 @@ impl Post {
 }
 fn get_slug(link: &str) -> String {
-    link.split_once("/posts/").unwrap().1.to_string()
+    link.split_once("/posts/").unwrap().1.trim_end_matches('/').to_string()
 }
-async fn check_if_post_exists(entry: &Entry) -> bool {
-    let posts_url = "https://notes.ansonbiggs.com/";
-    let link = entry.links.first().unwrap().href.as_str();
-    let slug = get_slug(link);
+async fn extract_article_content(original_link: &str) -> String {
+    // Convert original link to ghost-content version
+    let ghost_link = original_link.replace("projects.ansonbiggs.com", "projects.ansonbiggs.com/ghost-content");
+    match reqwest::get(&ghost_link).await {
+        Ok(response) => {
+            match response.text().await {
+                Ok(html_content) => {
+                    let document = Html::parse_document(&html_content);
+                    // Try different selectors to find the main content
+                    let content_selectors = [
+                        "#quarto-content main",
+                        "#quarto-content",
+                        "main",
+                        "article",
+                        ".content",
+                        "body"
+                    ];
+                    for selector_str in &content_selectors {
+                        if let Ok(selector) = Selector::parse(selector_str) {
+                            if let Some(element) = document.select(&selector).next() {
+                                let content = element.inner_html();
+                                if !content.trim().is_empty() {
+                                    return content;
+                                }
+                            }
+                        }
+                    }
+                    // Fallback: return original content with iframe if extraction fails
+                    format!(r#"<div class="fallback-iframe">
+                        <p><em>Content extraction failed. Falling back to embedded view:</em></p>
+                        <iframe src="{}" style="width: 100%; height: 80vh; border: none;" loading="lazy"></iframe>
+                    </div>"#, original_link)
+                }
+                Err(_) => format!(r#"<p><em>Failed to fetch content. <a href="{}">View original post</a></em></p>"#, original_link)
+            }
+        }
+        Err(_) => format!(r#"<p><em>Failed to fetch content. <a href="{}">View original post</a></em></p>"#, original_link)
+    }
+}
-    match reqwest::get(format!("{}{}", posts_url, slug)).await {
-        Ok(response) => response.status().is_success(),
-        Err(_) => false,
+#[derive(Deserialize, Debug)]
+struct GhostPostsResponse {
+    posts: Vec<GhostPost>,
+}
+#[derive(Deserialize, Debug)]
+struct GhostPost {
+    id: String,
+}
+async fn get_existing_post_id(slug: &str, token: &str) -> Option<String> {
+    let client = Client::new();
+    let api_url = format!("https://notes.ansonbiggs.com/ghost/api/v3/admin/posts/slug/{}/", slug);
+    match client
+        .get(&api_url)
+        .header("Authorization", format!("Ghost {}", token))
+        .send()
+        .await
+    {
+        Ok(response) => {
+            if response.status().is_success() {
+                if let Ok(ghost_response) = response.json::<GhostPostsResponse>().await {
+                    ghost_response.posts.first().map(|post| post.id.clone())
+                } else {
+                    None
+                }
+            } else {
+                None
+            }
+        }
+        Err(_) => None,
     }
 }
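
The selector loop in `extract_article_content()` boils down to "take the first candidate, in priority order, that is present and not blank". A library-free sketch of that rule follows. `first_non_empty` is a hypothetical name, and each `Option<&str>` stands in for the `inner_html()` of one selector (`None` when the selector matched nothing); the real function works on `scraper` elements rather than plain strings.

```rust
/// Hypothetical helper: return the first candidate that exists and contains
/// something other than whitespace, preserving the caller's priority order.
fn first_non_empty<'a, I>(candidates: I) -> Option<&'a str>
where
    I: IntoIterator<Item = Option<&'a str>>,
{
    candidates
        .into_iter()
        .flatten() // skip selectors that matched nothing
        .find(|html| !html.trim().is_empty()) // skip matches that are blank
}
```

When this returns `None`, the caller would fall through to the iframe fallback, as the diff does after its `for` loop.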
@@ -208,6 +294,8 @@ async fn main() {
let ghost_api_url = "https://notes.ansonbiggs.com/ghost/api/v3/admin/posts/?source=html";
let ghost_admin_api_key = env::var("admin_api_key").unwrap();
let feed = "https://projects.ansonbiggs.com/index.xml";
// Split the key into ID and SECRET
@@ -243,7 +331,12 @@ async fn main() {
     let post_exists_futures = entries.into_iter().map(|entry| {
         let entry_clone = entry.clone();
-        async move { (entry_clone, check_if_post_exists(&entry).await) }
+        let token_clone = token.clone();
+        async move {
+            let link = entry.links.first().unwrap().href.as_str();
+            let slug = get_slug(link);
+            (entry_clone, get_existing_post_id(&slug, &token_clone).await.is_some())
+        }
     });
     let post_exists_results = join_all(post_exists_futures).await;
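
The `// Split the key into ID and SECRET` comment in `main()` refers to the Ghost Admin API key format, `id:secret`, where the secret half is hex-encoded. The code that performs the split lies outside this diff, so the following is only a sketch of that step with a hypothetical `split_admin_key` helper; signing the JWT from these parts is left out.

```rust
/// Hypothetical helper: split a Ghost Admin API key of the form `id:secret`
/// into the key id and the raw secret bytes decoded from hex.
/// Returns `None` for a missing colon, an empty half, or non-hex input.
fn split_admin_key(key: &str) -> Option<(String, Vec<u8>)> {
    let (id, secret_hex) = key.split_once(':')?;
    let well_formed = !id.is_empty()
        && !secret_hex.is_empty()
        && secret_hex.len() % 2 == 0
        && secret_hex.bytes().all(|b| b.is_ascii_hexdigit());
    if !well_formed {
        return None;
    }
    // Decode two hex digits at a time into one byte.
    let secret = (0..secret_hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&secret_hex[i..i + 2], 16).ok())
        .collect::<Option<Vec<u8>>>()?;
    Some((id.to_string(), secret))
}
```

The id would go into the JWT header as `kid` and the decoded bytes would serve as the signing secret; the resulting token is what `get_existing_post_id()` sends in its `Authorization: Ghost …` header.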