Commit a9cca68

fix(blog): make post 100 less templated, drop academic going-deeper
1 parent 3dede2f commit a9cca68

8 files changed

Lines changed: 102 additions & 54 deletions

front/public/blog/audio/curiosities/100-posts-knowledge-graph-retrospective.json

Lines changed: 5 additions & 5 deletions
@@ -3,11 +3,11 @@
  "category": "curiosities",
  "lang": "en",
  "voice": "en-US-AndrewMultilingualNeural",
- "hash": "b0f5ddbc5ad3c253abadf5be1e7e907d9c11b8370a9845493888f3a52e6ec2d0",
- "sourceHash": "b0f5ddbc5ad3c253abadf5be1e7e907d9c11b8370a9845493888f3a52e6ec2d0",
- "durationSec": 2151.432,
- "byteSize": 12908592,
- "narrationWordCount": 5134,
+ "hash": "153ee03744cda0ec6b32307fb7587512050137da8013f41ce47616fa89dd6ec8",
+ "sourceHash": "153ee03744cda0ec6b32307fb7587512050137da8013f41ce47616fa89dd6ec8",
+ "durationSec": 1120.68,
+ "byteSize": 6724080,
+ "narrationWordCount": 5179,
  "audioUrl": "/blog/audio/curiosities/100-posts-knowledge-graph-retrospective.mp3",
  "translationModel": ""
  }
Binary file not shown.

front/public/blog/audio/manifest.json

Lines changed: 27 additions & 1 deletion
@@ -1,8 +1,21 @@
  {
  "lang": "en",
  "voice": "en-US-AndrewMultilingualNeural",
- "generatedAt": 1777684012,
+ "generatedAt": 1777694696,
  "posts": {
+ "curiosities/100-posts-knowledge-graph-retrospective": {
+ "slug": "100-posts-knowledge-graph-retrospective",
+ "category": "curiosities",
+ "lang": "en",
+ "voice": "en-US-AndrewMultilingualNeural",
+ "hash": "153ee03744cda0ec6b32307fb7587512050137da8013f41ce47616fa89dd6ec8",
+ "sourceHash": "153ee03744cda0ec6b32307fb7587512050137da8013f41ce47616fa89dd6ec8",
+ "durationSec": 1120.68,
+ "byteSize": 6724080,
+ "narrationWordCount": 5179,
+ "audioUrl": "/blog/audio/curiosities/100-posts-knowledge-graph-retrospective.mp3",
+ "translationModel": ""
+ },
  "curiosities/algebraic-number-theory-when-factorization-breaks": {
  "slug": "algebraic-number-theory-when-factorization-breaks",
  "category": "curiosities",
@@ -1043,6 +1056,19 @@
  "audioUrl": "/blog/audio/field-notes/sql-pandas-pyspark-duckdb.mp3",
  "translationModel": ""
  },
+ "field-notes/stack-recommendations-after-100-posts": {
+ "slug": "stack-recommendations-after-100-posts",
+ "category": "field-notes",
+ "lang": "en",
+ "voice": "en-US-AndrewMultilingualNeural",
+ "hash": "a97b6908e71478693fc3cd8f440419222d1536e25f05652de111afefb1b060df",
+ "sourceHash": "a97b6908e71478693fc3cd8f440419222d1536e25f05652de111afefb1b060df",
+ "durationSec": 2286.744,
+ "byteSize": 13720464,
+ "narrationWordCount": 5638,
+ "audioUrl": "/blog/audio/field-notes/stack-recommendations-after-100-posts.mp3",
+ "translationModel": ""
+ },
  "field-notes/structuring-ml-projects": {
  "slug": "structuring-ml-projects",
  "category": "field-notes",

front/public/blog/posts/curiosities/100-posts-knowledge-graph-retrospective.md

Lines changed: 9 additions & 41 deletions
@@ -29,7 +29,7 @@ Reading a blog as a chronology is the obvious move and almost always the wrong o

  The corpus is about its structure. Every post has tags. Every tag connects to other tags through co-occurrence on the same post. Every two posts that share a tag are connected; two posts that share several tags are connected through a thicker rope. Concepts strung across posts thread the whole corpus together. What you get when you draw this is a graph: nodes are posts (or tags, depending on how you project), edges are co-mentions, weights are how often two things travel together.

- This is not a metaphor. It is the same construction the network science literature has been refining for fifty years. Granovetter's weak ties, Newman's modularity, the Louvain method, eigenvector centrality, betweenness — all of it works on any graph you can construct, and a tag-co-occurrence graph from a blog corpus is, structurally, no different from a citation network or a protein interaction network. The math does not know the nodes are blog posts. It will tell you the same things it tells everyone else: which nodes are central, which form natural clusters, which act as bridges, which are dangling out on the periphery.
+ This is not a metaphor. It is the same construction network science uses for citation networks and protein interaction networks. The math does not know the nodes are blog posts. It will tell you the same things it tells everyone else: which nodes are central, which form natural clusters, which act as bridges, which are dangling out on the periphery.

  The site already exposes a small graph view at `/blog/graph` that lets a reader navigate by clicking related posts. The view I want for this retrospective is the analytic one: not "what should I read next" but "what is the shape of the thing I have built." Graph theory gives me the vocabulary; network science gives me the verdicts.

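For readers of this diff, the construction described in the hunk above (tags as nodes, co-occurrence on the same post as edges, counts as weights) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's code: the `posts` dictionary is hypothetical sample data, the helper names `tag_cooccurrence` and `weighted_degree` are invented here, and the real schema of `front/src/data/blogData.json` is not shown in this commit.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data (slug -> tags). The real corpus is said to live in
# front/src/data/blogData.json; its schema is not visible in this diff.
posts = {
    "post-a": ["Graph Theory", "Algorithms", "Statistics"],
    "post-b": ["Graph Theory", "Algorithms"],
    "post-c": ["Statistics", "MLOps"],
}


def tag_cooccurrence(posts):
    """Count, for each unordered pair of tags, how many posts carry both."""
    weights = Counter()
    for tags in posts.values():
        # sorted(set(...)) removes duplicate tags and fixes one canonical
        # order per pair, so (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum the edge weights touching each tag (a rough hub score)."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


edges = tag_cooccurrence(posts)
hubs = weighted_degree(edges)

for (a, b), w in sorted(edges.items(), key=lambda kv: -kv[1]):
    print(f"{a} -- {b}: weight {w}")
```

The resulting `(tag, tag) -> weight` mapping is an ordinary weighted edge list, so it could be handed to a graph library (for example networkx's `Graph.add_weighted_edges_from`) for the centrality and community steps the post goes on to discuss.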
@@ -250,7 +250,7 @@ The pattern across the top 10 edges: about half are real intellectual co-occurre

  The spine and the hubs tell you about gravity. Communities tell you about structure: which clusters of nodes are densely interconnected internally and only sparsely connected to the rest of the graph.

- To detect communities I run the Louvain method on the tag-tag graph. Louvain — the [fast-unfolding algorithm by Blondel et al. (2008)](https://arxiv.org/abs/0803.0476) — maximizes [Newman's modularity](https://www.pnas.org/doi/10.1073/pnas.0601602103) greedily by repeatedly merging nodes into the community that gives the largest local gain, then collapsing the graph and repeating. Modularity, in plain language, is "edges-within-clusters minus edges-you-would-expect-by-chance." A high-modularity partition has dense communities and sparse cuts between them.
+ To detect communities I run the Louvain method on the tag-tag graph. It maximizes modularity — roughly, "edges-within-clusters minus edges-you-would-expect-by-chance" — by greedily merging nodes into the community that gives the biggest local gain. A high-modularity partition has dense communities and sparse cuts between them.

  ```python
  import networkx.algorithms.community as nxcomm
@@ -500,7 +500,7 @@ This is the five-post agent arc the corpus has been pointing at for several mont
  4. [ontology-production-pipeline-gcp](https://juanlara18.github.io/portfolio/#/blog/ontology-production-pipeline-gcp)
  5. [ontology-to-agent-toolbox](https://juanlara18.github.io/portfolio/#/blog/ontology-to-agent-toolbox)

- These three paths cover roughly half the corpus by tag overlap. The other half is reachable from any of them within two hops, which is the small-world property doing its work — but I will not claim small-world rigorously without measuring clustering coefficient against a Watts-Strogatz null model, and that is a sidebar I am leaving for another piece.
+ These three paths cover roughly half the corpus by tag overlap. The other half is reachable from any of them within two hops. Two hops is short, which is the kind of property that makes a graph feel small.

  ---

@@ -595,46 +595,14 @@ That is the whole promotion path. From drafted Markdown to a node in a graph in

  ---

- ## Going Deeper
+ ## A Closer

- **Books:**
+ I usually end posts with a Going Deeper section: books, papers, videos, questions to think about. This one does not get that. There is no canon to point you to here. The corpus *is* the canon I am pointing at, and the only honest follow-up is the next post.

- - Newman, M. (2018). *Networks.* Oxford University Press, 2nd edition.
- - The graduate-level reference for everything in this post: degree distributions, modularity, betweenness, community detection, random graph models. If you only own one network science book, this is it.
- - Barabási, A.-L. (2016). *Network Science.* Cambridge University Press.
- - The free online edition at networksciencebook.com is the most accessible introduction. Strong on scale-free networks and the empirical regularities that make real-world graphs look like real-world graphs.
- - Watts, D. J. (2003). *Six Degrees: The Science of a Connected Age.* W. W. Norton.
- - The popular-audience companion to the small-world paper. Worth reading specifically for the chapters on how Watts and Strogatz arrived at the model. Lighter on math, heavier on intellectual history.
- - Easley, D., and Kleinberg, J. (2010). *Networks, Crowds, and Markets: Reasoning About a Highly Connected World.* Cambridge University Press.
- - Free online. Bridges the gap between graph theory and economics, with a long chapter on information cascades that resonates with the "why some posts spread and others do not" question I sidestepped here.
+ Post #101, [stack-recommendations-after-100-posts](https://juanlara18.github.io/portfolio/#/blog/stack-recommendations-after-100-posts), is the practical companion to this one. Two halves of the same retrospective: this one is the shape of what I wrote; the next is what I would actually use today, knowing what I know after writing about hundreds of options.

- **Online Resources:**
+ If you want to run this analysis on your own corpus, the snippets above are enough. The numbers in this post came from `front/src/data/blogData.json` plus about a hundred lines of networkx; you can verify any claim by re-running the same code. That auditability turned out to be the thing I was after when I started writing — not posts you have to trust, but posts you can argue with.

- - [networkx documentation](https://networkx.org/documentation/stable/) — Reference for every function I used in this post: `degree`, `betweenness_centrality`, `louvain_communities`, `modularity`. Read the user guide once and the reference will pay back the time.
- - [Network Science by Barabási, online edition](http://networksciencebook.com/) — Free, hyperlinked, with interactive figures. Chapters 4 (Scale-Free Networks) and 9 (Communities) are directly relevant to this retrospective.
- - [Stanford CS224W: Machine Learning with Graphs](https://web.stanford.edu/class/cs224w/) — Course materials are public. Goes deeper into graph machine learning, but the early lectures on graph statistics and community detection are excellent on their own.
- - [Cytoscape](https://cytoscape.org/) — If you want to actually visualize a personal corpus graph, export the edge list from networkx and load it into Cytoscape. The static images in this post do not do justice to what the graph looks like in motion.
+ A hundred is an arbitrary number. The graph does not care. But arbitrary numbers are useful as forcing functions, and this one forced me to look at the dataset I had been generating without ever measuring. It turned out to know more about me than I knew about it.

- **Videos:**
-
- - [The Mathematics of Networks](https://www.youtube.com/watch?v=lETt7IcDWLI) by Steven Strogatz — Strogatz himself walking through small-world phenomena and the original 1998 paper. Good companion reading for the methodology section.
- - [Community Detection with the Louvain Algorithm](https://www.youtube.com/watch?v=0zuiLBOIcsw) — A focused, mathematical walkthrough of the algorithm I used to partition the tag graph in this post.
-
- **Academic Papers:**
-
- - Watts, D. J., and Strogatz, S. H. (1998). ["Collective dynamics of 'small-world' networks."](https://www.nature.com/articles/30918) *Nature*, 393(6684), 440–442.
- - The foundational small-world paper. The reason we expect any reasonably-connected graph to have short paths.
- - Newman, M. E. J. (2006). ["Modularity and community structure in networks."](https://www.pnas.org/doi/10.1073/pnas.0601602103) *PNAS*, 103(23), 8577–8582.
- - The definitional paper for modularity, the objective function the Louvain method maximizes. Its eigenvector-based formulation is also the cleanest derivation of the modularity matrix.
- - Blondel, V. D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. (2008). ["Fast unfolding of communities in large networks."](https://arxiv.org/abs/0803.0476) *Journal of Statistical Mechanics: Theory and Experiment*, 2008(10), P10008.
- - The Louvain method paper. A greedy, modularity-maximizing community detection algorithm fast enough to run on graphs with billions of edges, and very much overkill for 223 nodes — but the standard tool, and the one I used here.
- - Granovetter, M. S. (1973). ["The strength of weak ties."](https://www.jstor.org/stable/2776392) *American Journal of Sociology*, 78(6), 1360–1380.
- - The conceptual origin of bridges-as-information-pathways. Reading this paper alongside the bridges section above is the cleanest way to understand why the three bridge posts I cited matter more than their individual readership numbers suggest.
-
- **Questions to Explore:**
-
- - If a personal blog graph is a low-dimensional embedding of its author, what other graphs in your life embed you in the same way? Your code repository commit graph? Your email reply graph? Your reading list? Are these embeddings consistent with each other, or do they reveal different selves?
- - The singleton tag problem is a measurable editorial KPI. What other corpus-level KPIs should a writer track? Average post-post path length? Modularity over time? Cluster size variance? Which of these are gameable and which are diagnostic?
- - The Louvain algorithm is greedy and stochastic. The communities it returns depend on the seed. How would you decide whether a community is "real" — that is, robust across many runs of the algorithm — versus an artifact of a specific seed? The literature has answers (consensus clustering, modularity over null models); would you accept them or look for stronger evidence?
- - Is there a *right* number of communities a personal blog should have? Too few and the corpus is one-dimensional; too many and it is incoherent. Five communities feels right to me at 99 posts. Should that scale linearly with corpus size, sub-linearly, or saturate?
- - What would post #200's retrospective look like? If you could fast-forward and see it now, which of the to-do items above would have been completed, which would have been ignored, and which would have been replaced by problems you cannot see today?
+ Thanks for being here for any of these. The next one starts now.

front/public/rss.xml

Lines changed: 43 additions & 1 deletion
@@ -9,8 +9,50 @@
  <atom:link href="https://juanlara18.github.io/portfolio/rss.xml" rel="self" type="application/rss+xml" />
  <description>Technical writing on machine learning, AI agents, NLP, and data engineering — research notes, field notes, and curiosities.</description>
  <language>en-us</language>
- <lastBuildDate>Thu, 05 Aug 2027 00:00:00 GMT</lastBuildDate>
+ <lastBuildDate>Thu, 19 Aug 2027 00:00:00 GMT</lastBuildDate>
  <generator>build-rss.js</generator>
+ <item>
+ <title>The Stack I Would Adopt After 100 Posts: An Opinionated Manifesto</title>
+ <link>https://juanlara18.github.io/portfolio/blog/field-notes/stack-recommendations-after-100-posts</link>
+ <guid isPermaLink="true">https://juanlara18.github.io/portfolio/blog/field-notes/stack-recommendations-after-100-posts</guid>
+ <pubDate>Thu, 19 Aug 2027 00:00:00 GMT</pubDate>
+ <description><![CDATA[The hundredth post was the structural retrospective. This is the practical one. After a hundred posts of saying it depends, here is the stack I would actually pick today, the books and papers that earned their place on my shelf, the patterns that proved their weight, and the ones I would refuse to deploy a second time.]]></description>
+ <category>field-notes</category>
+ <category>Production ML</category>
+ <category>Best Practices</category>
+ <category>MLOps</category>
+ <category>LLMs</category>
+ <category>RAG</category>
+ <category>Agents</category>
+ <category>Software Engineering</category>
+ <category>Infrastructure</category>
+ <category>Cloud Computing</category>
+ <category>Evaluation</category>
+ <category>AI Engineering</category>
+ <category>Vector Databases</category>
+ <enclosure url="https://pub-00d57ee081654fe389ef2660b8f38f69.r2.dev/audio/field-notes/stack-recommendations-after-100-posts.mp3" length="13720464" type="audio/mpeg" />
+ <dc:creator>Juan Lara</dc:creator>
+ </item>
+ <item>
+ <title>100 Posts as a Knowledge Graph: A Retrospective in Network Science</title>
+ <link>https://juanlara18.github.io/portfolio/blog/curiosities/100-posts-knowledge-graph-retrospective</link>
+ <guid isPermaLink="true">https://juanlara18.github.io/portfolio/blog/curiosities/100-posts-knowledge-graph-retrospective</guid>
+ <pubDate>Thu, 12 Aug 2027 00:00:00 GMT</pubDate>
+ <description><![CDATA[When you write 99 posts and then plot the result as a graph, the picture is not what you thought you were drawing. This is post number 100, and instead of a victory lap I ran the corpus through networkx: 99 nodes, 685k words, 223 tags, 2,408 tag-tag edges. What the structure reveals is more interesting than the chronology. There is a spine, four-and-a-half communities, a long tail of singleton tags that I tagged once and forgot, a handful of bridge posts holding the graph together, and a measurable bias toward production over theory. This is the blog reading itself, with real numbers, real cluster names, and the uncomfortable parts left in.]]></description>
+ <category>curiosities</category>
+ <category>Knowledge Graphs</category>
+ <category>Graph Theory</category>
+ <category>Mathematics</category>
+ <category>Algorithms</category>
+ <category>Software Engineering</category>
+ <category>Best Practices</category>
+ <category>Statistics</category>
+ <category>Data Engineering</category>
+ <category>Information Retrieval</category>
+ <category>Computer Science</category>
+ <enclosure url="https://pub-00d57ee081654fe389ef2660b8f38f69.r2.dev/audio/curiosities/100-posts-knowledge-graph-retrospective.mp3" length="12908592" type="audio/mpeg" />
+ <dc:creator>Juan Lara</dc:creator>
+ </item>
  <item>
  <title>Agent Engineering as a Discipline: Six Roles That Just Got Names</title>
  <link>https://juanlara18.github.io/portfolio/blog/field-notes/agent-engineering-disciplines</link>

front/public/sitemap.xml

Lines changed: 14 additions & 2 deletions
@@ -31,17 +31,29 @@
  <priority>0.7</priority>
  </url>
  <url>
- <loc>https://juanlara18.github.io/portfolio/blog/category/research</loc>
+ <loc>https://juanlara18.github.io/portfolio/blog/category/curiosities</loc>
  <lastmod>2026-05-02</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.7</priority>
  </url>
  <url>
- <loc>https://juanlara18.github.io/portfolio/blog/category/curiosities</loc>
+ <loc>https://juanlara18.github.io/portfolio/blog/category/research</loc>
  <lastmod>2026-05-02</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.7</priority>
  </url>
+ <url>
+ <loc>https://juanlara18.github.io/portfolio/blog/field-notes/stack-recommendations-after-100-posts</loc>
+ <lastmod>2027-08-19</lastmod>
+ <changefreq>monthly</changefreq>
+ <priority>0.7</priority>
+ </url>
+ <url>
+ <loc>https://juanlara18.github.io/portfolio/blog/curiosities/100-posts-knowledge-graph-retrospective</loc>
+ <lastmod>2027-08-12</lastmod>
+ <changefreq>monthly</changefreq>
+ <priority>0.7</priority>
+ </url>
  <url>
  <loc>https://juanlara18.github.io/portfolio/blog/field-notes/agent-engineering-disciplines</loc>
  <lastmod>2027-08-05</lastmod>

front/src/data/blogData.json

Lines changed: 3 additions & 3 deletions
Large diffs are not rendered by default.

knowledge-base/posts.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
  {
  "$schema_version": "1.0",
- "generated_at": "2026-05-02T03:54:42.571Z",
+ "generated_at": "2026-05-02T03:58:51.943Z",
  "manifest": {
  "purpose": "Machine-readable index of the blog. Pair with knowledge-base/KNOWLEDGE_BASE.md for narrative context, reading paths, and cross-cutting views.",
  "authoring_flow": "Edit knowledge-base/KNOWLEDGE_BASE.md (the curated source). Run `npm run build-knowledge-base` (or any `npm run build`) to regenerate this file.",
