Commit 3dede2f

fix(blog): remove 2027 timeline claims and false day-job framing in post 100
1 parent 9b2a636 commit 3dede2f

4 files changed

Lines changed: 1254 additions & 998 deletions

front/public/blog/posts/curiosities/100-posts-knowledge-graph-retrospective.md

Lines changed: 8 additions & 27 deletions
@@ -13,7 +13,7 @@ estimatedWordCount: 7100
 
 When you write 99 posts and then plot the result as a graph, the picture is not what you thought you were drawing.
 
-I started the blog in January 2025 with a piece on the Rubik's cube and group theory. The plan, to the extent there was one, was to write a few math curiosities on the side, mostly because the Knowledge Data Engineer day job left no place to put long-form derivations. Sixteen months later I have 99 posts, 685,421 words, 223 distinct tags, and a habit. This piece is post number 100. It is also the first time I have looked at the corpus the way I would look at any other dataset I owned: dump it to JSON, load it into networkx, ask the graph what it knows.
+I started the blog in January 2025 with a piece on the Rubik's cube and group theory. The plan, to the extent there was one, was to write a few math curiosities on the side and see what stuck. What stuck turned out to be a habit: 99 posts, 685,421 words, 223 distinct tags. This piece is post number 100. It is also the first time I have looked at the corpus the way I would look at any other dataset I owned: dump it to JSON, load it into networkx, ask the graph what it knows.
 
 The answer is not what I expected. I thought I had been writing about LLMs and ontologies on top of a foundation in math and software engineering. The graph thinks I have been writing about LLMs and Production ML on top of a foundation in math and software engineering, with everything else clustered around those two gravity wells. I thought my curiosities posts were a vibrant side garden. The graph thinks they are a bright, dense, undersized continent that connects to the rest of the corpus through exactly two bridges. I thought my tagging discipline was reasonable. The graph thinks 39 percent of my tags are dead nodes, used once and never again.
 
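The workflow the post describes (dump the corpus to JSON, load it into networkx) can be sketched roughly as follows. The schema and tag names here are hypothetical toy stand-ins, not the blog's real data; the real dump has 99 posts and 223 tags.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Toy stand-in for the real posts dump; the field names are hypothetical.
posts = [
    {"slug": "rubiks-cube-group-theory", "tags": ["mathematics", "group theory"]},
    {"slug": "knowledge-graphs-practice", "tags": ["knowledge graphs", "ontologies"]},
    {"slug": "agent-engineering-disciplines", "tags": ["agents", "llms", "production ml"]},
]

# Bipartite graph: posts on one side, tags on the other.
G = nx.Graph()
for post in posts:
    G.add_node(post["slug"], kind="post")
    for tag in post["tags"]:
        G.add_node(tag, kind="tag")
        G.add_edge(post["slug"], tag)

# Project onto the tag side: two tags are connected when some post carries
# both, which is how a tag-tag edge count like the post's 2,408 arises.
tags = {n for n, d in G.nodes(data=True) if d["kind"] == "tag"}
tag_graph = bipartite.weighted_projected_graph(G, tags)

print(tag_graph.number_of_nodes(), tag_graph.number_of_edges())
```

The edge weights of the projection count shared posts, so heavily co-tagged pairs stand out immediately.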
@@ -57,38 +57,19 @@ Before any analysis, the raw shape. All numbers come from running an analytics p
 | Tag-tag edges | 2,408 |
 | Singleton tags | 87 (39 percent of all tags) |
 
-The split by category and by year:
+The split by category:
 
 | Category | Count | Share |
 |---|---|---|
 | field-notes | 73 | 73.7 percent |
 | curiosities | 14 | 14.1 percent |
 | research | 12 | 12.1 percent |
 
-| Year | Posts | Cumulative |
-|---|---|---|
-| 2025 | 12 | 12 |
-| 2026 | 56 | 68 |
-| 2027 | 31 | 99 |
-
 A few observations from the numbers alone, before we touch the graph.
 
-First, the corpus is dominated by field-notes. Roughly seventy-three of every hundred posts are practical writeups; only fourteen are curiosities and twelve are research deep-dives. This is the first signal that the blog is less of a balanced trio and more of a single applied-engineering torso with two small intellectual wings. It is also, in retrospect, a faithful reflection of how I spend my time: at a financial institution shipping production agents and data pipelines, with curiosities and research relegated to evenings.
-
-Second, the cadence story is not the slow climb you might guess. 2025 was a slow start: 12 posts spread over the year, roughly one a month. 2026 was an explosion: 56 posts, more than four a month, almost certainly correlated with the moment I admitted to myself that this was a real practice. 2027 partially regressed to a sustainable cadence: 31 posts in roughly seven months, around four a month again, but with longer pieces.
+First, the corpus is dominated by field-notes. Roughly seventy-three of every hundred posts are practical writeups; only fourteen are curiosities and twelve are research deep-dives. This is the first signal that the blog is less of a balanced trio and more of a single applied-engineering torso with two small intellectual wings. It is also, in retrospect, a faithful reflection of how the writing has evolved: most of the posts are practical writeups from data and ML work, with curiosities and research relegated to evenings.
 
-Third, the median post is 6,500 words. The mean is also high: 685,421 over 99 is roughly 6,923. This is not a list-blog. It is closer to a textbook with chapters that happen to be marketed as posts. The single longest piece, [reinforcement-learning-first-principles](https://juanlara18.github.io/portfolio/#/blog/reinforcement-learning-first-principles), is 18,000 words, which is a small book.
-
-```mermaid
-timeline
-    title Cumulative posts by year
-    2025 H1 : 0 posts
-    2025 H2 : 12 posts cumulative
-    2026 H1 : ~40 posts cumulative
-    2026 H2 : 68 posts cumulative
-    2027 H1 : 92 posts cumulative
-    2027 H2 : 99 posts cumulative, retrospective at 100
-```
+Second, the median post is 6,500 words. The mean is also high: 685,421 over 99 is roughly 6,923. This is not a list-blog. It is closer to a textbook with chapters that happen to be marketed as posts. The single longest piece, [reinforcement-learning-first-principles](https://juanlara18.github.io/portfolio/#/blog/reinforcement-learning-first-principles), is 18,000 words, which is a small book.
 
 ---
 
@@ -331,11 +312,11 @@ flowchart TB
 
 A few honest observations.
 
-**The LLM/RAG/Agents cluster is the largest community by far.** This is the gravity well I mentioned earlier. It absorbs new posts at the highest rate, and it has been the most active region of the blog through 2026 and 2027.
+**The LLM/RAG/Agents cluster is the largest community by far.** This is the gravity well I mentioned earlier. It absorbs new posts at the highest rate, and it has been the most active region of the blog over the last year.
 
 **The Math/Curiosities community is the smallest of the five but has the highest concept density per post.** A typical curiosities post has 8–10 tags, of which 5–6 are within the cluster. The cluster is small because there are only 14 curiosities posts and they all live in the same neighborhood. It has high quality per node and low coverage. This is the community I am most under-investing in.
 
-**The Knowledge Graphs / Ontology community is the youngest.** Most of its posts are from 2027. It is also the cluster with the strongest internal coherence: the ontology arc was deliberately written as a sequence ([ontologies-building-knowledge-bases](https://juanlara18.github.io/portfolio/#/blog/ontologies-building-knowledge-bases), [knowledge-graphs-practice](https://juanlara18.github.io/portfolio/#/blog/knowledge-graphs-practice), [tbox-abox-schema-facts-distinction](https://juanlara18.github.io/portfolio/#/blog/tbox-abox-schema-facts-distinction), [modular-ontologies-core-domains-pattern](https://juanlara18.github.io/portfolio/#/blog/modular-ontologies-core-domains-pattern), [ontology-production-pipeline-gcp](https://juanlara18.github.io/portfolio/#/blog/ontology-production-pipeline-gcp), [ontology-to-agent-toolbox](https://juanlara18.github.io/portfolio/#/blog/ontology-to-agent-toolbox)), and the sequencing shows up as tight modularity in the Louvain partition.
+**The Knowledge Graphs / Ontology community is the youngest.** Most of its posts are recent. It is also the cluster with the strongest internal coherence: the ontology arc was deliberately written as a sequence ([ontologies-building-knowledge-bases](https://juanlara18.github.io/portfolio/#/blog/ontologies-building-knowledge-bases), [knowledge-graphs-practice](https://juanlara18.github.io/portfolio/#/blog/knowledge-graphs-practice), [tbox-abox-schema-facts-distinction](https://juanlara18.github.io/portfolio/#/blog/tbox-abox-schema-facts-distinction), [modular-ontologies-core-domains-pattern](https://juanlara18.github.io/portfolio/#/blog/modular-ontologies-core-domains-pattern), [ontology-production-pipeline-gcp](https://juanlara18.github.io/portfolio/#/blog/ontology-production-pipeline-gcp), [ontology-to-agent-toolbox](https://juanlara18.github.io/portfolio/#/blog/ontology-to-agent-toolbox)), and the sequencing shows up as tight modularity in the Louvain partition.
 
 **The Foundations / ML internals / SE community is the most heterogeneous.** It mixes posts on Python, on bash, on Docker, on git, on Kubernetes, on file formats, on hashing, on software-engineering classics, with a few ML-internals posts. The community holds together because all of these posts share the "engineering hygiene" angle, not because they share a topic.
 
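The Louvain partition the hunk above refers to is modularity-based community detection, available directly in networkx. A minimal sketch on an invented two-cluster tag graph (the tag names and edges are toy data, not the blog's real graph):

```python
import networkx as nx

# Toy tag graph: two tight triangles joined by a single weak edge.
G = nx.Graph()
G.add_edges_from([
    ("llms", "rag"), ("rag", "agents"), ("agents", "llms"),
    ("mathematics", "graph theory"), ("graph theory", "number theory"),
    ("number theory", "mathematics"),
    ("llms", "mathematics"),  # the lone bridge between the clusters
])

# Louvain greedily maximizes modularity; a fixed seed makes the run reproducible.
communities = nx.community.louvain_communities(G, seed=42)
print([sorted(c) for c in communities])
```

On the real 223-tag graph the same call is what yields the "four-and-a-half communities" the retrospective names.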
@@ -509,9 +490,9 @@ The graph is also a navigation device. Three reading paths, each derived from de
 4. [query-routing-agent-decisions](https://juanlara18.github.io/portfolio/#/blog/query-routing-agent-decisions)
 5. [agent-engineering-disciplines](https://juanlara18.github.io/portfolio/#/blog/agent-engineering-disciplines)
 
-This is the five-post agent arc the corpus has been pointing at since early 2027. The arc ended at post #99, [agent-engineering-disciplines](https://juanlara18.github.io/portfolio/#/blog/agent-engineering-disciplines), and the natural continuation is in the upcoming stack-recommendations post #101.
+This is the five-post agent arc the corpus has been pointing at for several months. The arc ended at post #99, [agent-engineering-disciplines](https://juanlara18.github.io/portfolio/#/blog/agent-engineering-disciplines), and the natural continuation is in the upcoming stack-recommendations post #101.
 
-**Path C: Ontologies to Action.** The 2027 ontology arc, designed as a sequence.
+**Path C: Ontologies to Action.** The ontology arc, designed as a sequence.
 
 1. [ontologies-building-knowledge-bases](https://juanlara18.github.io/portfolio/#/blog/ontologies-building-knowledge-bases)
 2. [tbox-abox-schema-facts-distinction](https://juanlara18.github.io/portfolio/#/blog/tbox-abox-schema-facts-distinction)
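The "bridge posts holding the graph together" and the centrality-derived reading paths mentioned above can be surfaced with betweenness centrality. A sketch on an invented miniature of the post graph (slugs shortened and hypothetical):

```python
import networkx as nx

# Two clusters that only communicate through one edge; all slugs are toy data.
G = nx.Graph()
G.add_edges_from([
    ("ramanujan-constant", "graph-theory"),
    ("network-science", "graph-theory"),
    ("ramanujan-constant", "network-science"),
    ("rag-pipelines", "agent-disciplines"),
    ("rag-pipelines", "query-routing"),
    ("agent-disciplines", "query-routing"),
    ("graph-theory", "rag-pipelines"),  # the bridge edge
])

# Betweenness measures the share of shortest paths through each node; the
# bridge endpoints dominate because every cross-cluster path must use them.
bc = nx.betweenness_centrality(G)
top = sorted(bc, key=bc.get, reverse=True)[:2]
print(top)
```

Ranking the real graph this way is what picks out the two bridges between the curiosities continent and the applied mainland.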

front/src/data/blogData.json

Lines changed: 74 additions & 2 deletions
Large diffs are not rendered by default.

knowledge-base/KNOWLEDGE_BASE.md

Lines changed: 4 additions & 2 deletions
@@ -269,8 +269,9 @@ pagerank-eigenvectors:
 
 Auto-generated index of every post by category, sorted most recent first. Use this when you need a complete inventory of what the blog covers — for example, when loaded as Claude Project knowledge and you cannot query `posts.json`.
 
-### field-notes (73 posts)
+### field-notes (74 posts)
 
+- **`stack-recommendations-after-100-posts`** *(deep)* — The Stack I Would Adopt After 100 Posts: An Opinionated Manifesto. The hundredth post was the structural retrospective. This is the practical one. After a hundred posts of saying it depends, here is the stack I would actually pick today, the books and papers that earned their place on my shelf, the patterns that proved their weight, and the ones I would refuse to deploy a second time. Concepts: production ml, best practices, mlops, llms, rag, agents.
 - **`agent-engineering-disciplines`** *(deep)* — Agent Engineering as a Discipline: Six Roles That Just Got Names. By 2026 the people who keep production agents alive had stopped calling themselves AI engineers and started using more specific titles. Context Engineer. Memory Engineer. Harness Engineer. Tool Engineer. Eval Engineer. Identity and Policy Engineer. This post is a tour of those six disciplines: what each one owns, the artifacts they produce, the named effects they fight, the anti-patterns that keep biting, and an honest projection of which roles will consolidate, which will be absorbed by vendors, and which are LinkedIn theater. Concepts: agents, agentic ai, llms, production ml, best practices, software engineering.
 - **`knowledge-catalog-vs-ontologies`** *(deep)* — Knowledge Catalog vs Ontologies: A Confluence, Not a Replacement. Google's Knowledge Catalog and a domain ontology look like they answer the same question. They do not. One is an asset registry with governance and lineage; the other is a formal model of meaning with inferential reasoning. A mature knowledge layer almost always needs both, with a clear arrow of dependency between them. This post is the four-part arc's closing piece, naming the substitutions, the anti-patterns, and the honest hybrid architecture. Concepts: knowledge graphs, ontologies, ontology engineering, gcp, data architecture, agents.
 - **`gemini-enterprise-knowledge-catalog-deep-dive`** *(deep)* — Gemini Enterprise and the Knowledge Catalog: Two Buildings, Room by Room. The Cloud Next 26 overview gave you the map. This post zooms in on the two pieces that will reshape a Knowledge Data Engineer's day-to-day in the next twelve months: the Gemini Enterprise Agent Platform as a control plane, and the Knowledge Catalog as the semantic spine that grounds every agent answer in audited enterprise truth. Concepts: google cloud, vertex ai, agents, agentic ai, knowledge graphs, data architecture.
@@ -360,8 +361,9 @@ Auto-generated index of every post by category, sorted most recent first. Use th
 - **`embeddings-geometry-of-meaning`** *(working)* — Embeddings: The Geometry of Meaning. How do you teach a computer what 'king' means? You don't explain—you show it where 'king' lives in a space where meaning has coordinates. A deep dive into embeddings, from Word2Vec to modern sentence transformers, and why representing concepts as vectors changed everything. Concepts: embeddings, vector space, cosine similarity, manifold structure.
 - **`attention-is-all-you-need`** *(intro)* — Attention is All You Need: Understanding the Transformer Revolution. How a single elegant idea—pure attention—toppled decades of sequential thinking and sparked the AI revolution. A deep dive into the architecture that changed everything. Concepts: transformers, deep learning, nlp, attention, research papers, neural network theory.
 
-### curiosities (14 posts)
+### curiosities (15 posts)
 
+- **`100-posts-knowledge-graph-retrospective`** *(deep)* — 100 Posts as a Knowledge Graph: A Retrospective in Network Science. When you write 99 posts and then plot the result as a graph, the picture is not what you thought you were drawing. This is post number 100, and instead of a victory lap I ran the corpus through networkx: 99 nodes, 685k words, 223 tags, 2,408 tag-tag edges. What the structure reveals is more interesting than the chronology. There is a spine, four-and-a-half communities, a long tail of singleton tags that I tagged once and forgot, a handful of bridge posts holding the graph together, and a measurable bias toward production over theory. This is the blog reading itself, with real numbers, real cluster names, and the uncomfortable parts left in. Concepts: knowledge graphs, graph theory, mathematics, algorithms, software engineering, best practices.
 - **`network-science-communities-centrality`** *(deep)* — Network Science: Communities, Centrality, and Small Worlds. Graph theory gives you the language. Network science asks: what does a graph's structure tell you about the system it represents? From Granovetter's weak ties to Barabasi's scale-free hubs, this is the science of extracting meaning from connections -- who matters most, who belongs together, and why real networks look nothing like random ones. Concepts: mathematics, graph theory, algorithms, probability, data science, statistics.
 - **`graph-theory-mathematics-of-connections`** *(deep)* — Graph Theory: The Mathematics of Connections. From Euler's walk across seven bridges in 1736 to the mathematics that powers social networks, recommendation systems, and neural networks -- graph theory is the language of connections. This is the foundation that every algorithm on networks assumes you already know. Concepts: mathematics, graph theory, algorithms, computer science, topology, combinatorics.
 - **`ramanujan-constant-almost-integer`** *(deep)* — Ramanujan's Constant: Why e^(pi*sqrt(163)) Is Almost an Integer. The number e^(pi*sqrt(163)) misses being an integer by about 7.5 x 10^-13. This is not a coincidence -- it is a consequence of 163 being a Heegner number, where the j-invariant, complex multiplication, and the class number one problem converge into one of the most beautiful near-misses in all of mathematics. Concepts: mathematics, number theory, complex analysis, series, foundations of mathematics, algorithms.
