Goals of Kythe

(Written by Lawrence Krubner; indented passages are often quotes.) You can contact Lawrence at: lawrence@krubner.com

Interesting. How do you get different tools to work together via some common specification? This sounds a bit like a new approach to the problems that the industry failed to solve 10 years ago with the insane WebServices approach.

The best way to view Kythe is as a “hub” for connecting tools for various languages, clients and build systems. By defining language-agnostic protocols and data formats for representing, accessing and querying source code information as data, Kythe allows language analysis and indexing to be run as services. This, in turn, enables lightweight (“thin”) composition of analysis tools with client tools such as editors, IDEs, and code browsers.

A hub-and-spoke model reduces the overall work to integrate L languages, C clients, and B build systems from a worst-case of O(L×C×B) — combinatorial in the size of the ecosystem — to O(L+C+B): Implementing Kythe compatibility for a given compiler, editor, or build system is, roughly, a constant up-front cost for each component, after which that component can interoperate with all the existing pieces directly.
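To make that arithmetic concrete, here is a small sketch that is not from the Kythe documentation. The function names and the ecosystem sizes (5 languages, 4 clients, 3 build systems) are made up for illustration, and it assumes the worst case where every combination needs its own integration.

```python
def pairwise_integrations(languages, clients, build_systems):
    """Worst case: one bespoke integration per (language, client, build system) combination."""
    return languages * clients * build_systems


def hub_integrations(languages, clients, build_systems):
    """Hub-and-spoke: each component is adapted to the hub exactly once."""
    return languages + clients + build_systems


# Hypothetical ecosystem: 5 languages, 4 clients, 3 build systems.
print(pairwise_integrations(5, 4, 3))  # 60
print(hub_integrations(5, 4, 3))       # 12

# Adding a sixth language costs 12 new integrations in the pairwise
# model (4 clients x 3 build systems), but only 1 with a hub.
print(pairwise_integrations(6, 4, 3) - pairwise_integrations(5, 4, 3))  # 12
print(hub_integrations(6, 4, 3) - hub_integrations(5, 4, 3))            # 1
```

The last two lines are the real selling point: with a hub, the cost of adding a component stops depending on how many other components already exist.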

To make this model work, Kythe provides a language-agnostic graph structure to capture build-system and compiler metadata, as well as semantic information about source code such as cross-references (e.g., definitions and their usages, type information, and cross-language associations). By design, the Kythe graph schema is liberal and extensible — we’ve defined a number of useful subgraphs, but new node and edge kinds are structured so that the graph can easily be extended without recourse to a central authority.
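A toy version of such a graph might look like the sketch below. This is an illustration of the idea only, not Kythe's actual data model or API: the `Graph` class, the node identifiers, and the kind strings (`"function"`, `"anchor"`, `"ref"`, `"x-myteam/reviewed-by"`) are all hypothetical. The point it tries to show is that kinds are plain strings, so a tool can introduce a new one without anyone's permission.

```python
from collections import defaultdict


class Graph:
    """A minimal labeled graph; node and edge kinds are free-form strings."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of facts (including "kind")
        self.edges = defaultdict(list)  # (source id, edge kind) -> list of target ids

    def add_node(self, ident, kind, **facts):
        self.nodes[ident] = {"kind": kind, **facts}

    def add_edge(self, source, kind, target):
        self.edges[(source, kind)].append(target)

    def targets(self, source, kind):
        # .get avoids creating empty entries for unknown lookups.
        return list(self.edges.get((source, kind), []))


graph = Graph()

# A definition and a usage site, linked by a cross-reference edge.
graph.add_node("fn:parse", "function", language="python")
graph.add_node("anchor:main.py:12", "anchor", path="main.py")
graph.add_edge("anchor:main.py:12", "ref", "fn:parse")

# A team-specific edge kind, added without touching any central registry.
graph.add_node("person:alice", "x-myteam/person")
graph.add_edge("fn:parse", "x-myteam/reviewed-by", "person:alice")

print(graph.targets("anchor:main.py:12", "ref"))
print(graph.targets("fn:parse", "x-myteam/reviewed-by"))
```

A client that only understands `"ref"` edges can ignore the `x-myteam/...` kinds entirely, which is what keeps the extension harmless to everyone else.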

One of the basic design principles of Kythe is that interoperability should not be ‘all-or-nothing’: Tools should adjust gracefully to missing or incomplete data. For many purposes, we’ve found that some information is almost always better than none. At the same time, it is better to emit incomplete data than to emit incorrect data. In practice, the important point is that tools should not “give up” in the presence of incomplete data, as partial results are often still useful.
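In client code, that principle could look something like the following sketch. It is a hypothetical hover-text helper, not anything from Kythe itself; the record fields (`name`, `type`, `defined_at`) are assumptions made up for the example. It shows whatever it knows, says so when something is missing, and never guesses.

```python
def describe_symbol(record):
    """Return a hover-text line built from whatever fields are available."""
    name = record.get("name")
    if name is None:
        # Without a name there is nothing trustworthy to show.
        return None

    parts = [name]
    if record.get("type"):
        parts.append(": " + record["type"])
    if record.get("defined_at"):
        parts.append(" (defined at " + record["defined_at"] + ")")
    else:
        # Report the gap instead of inventing a location.
        parts.append(" (definition location unknown)")
    return "".join(parts)


# Complete data: everything is shown.
print(describe_symbol({"name": "parse", "type": "str -> Node", "defined_at": "parser.py:10"}))

# Incomplete data: the tool still produces something useful.
print(describe_symbol({"name": "parse"}))
```

The alternative design, raising an exception as soon as any field is missing, is the "all-or-nothing" behavior the passage warns against.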

An oddly simple and innocent remark on Hacker News:

Programming language researchers always seem to miss the biggest selling point of manifest and (partly-)nominal typing. It’s the thing that happens after a C# programmer types a “.” in Visual Studio. Want someone to use a single language and voluntarily turn the play-doh of a prototype into a sound concrete foundation via gradual typing? Tell them they will get full IntelliSense.

…By the way, for those who didn’t understand what Kythe[1] was all about: THIS is what it’s all about. It’s possible without types, but it’s MUCH better with them.

[1] http://www.kythe.io/docs/schema/

This is a bit more reasonable:

I like IntelliSense and its just-as-good brethren mostly in Java-focused IDEs, but my perspective is that it’s a valuable but small bonus, rather than anything like a 15-year leap of progress. It’s nice, but it just isn’t that important, and it’s very annoying when people act like if you don’t see it as some sort of Best. Feature. Ever. you’re living in the past.

IntelliSense isn’t even close to the coolest thing about good type systems. What’s cool about them is catching dumb errors and contract violations statically and reducing the number of unit tests that it takes to feel confident in a piece of software.

Kythe is a really great project, though.