Inverting the Web 

We use search engines because the Web does not support accessing documents by anything other than URL. This puts a huge amount of control in the hands of the search engine company and those who control the DNS hierarchy.

Given that search engine companies can barely keep up with the constant barrage of attacks, commonly known as "SEO", intended to lower the quality of their results, a distributed inverted index seems like it would be impossible to build.

@freakazoid What methods *other* than URL are you suggesting? Because it is simply a Uniform Resource Locator (or Identifier, as URI).

Not all online content is social / personal. I'm not understanding your suggestion well enough to criticise it, but it seems to have some ... capacious holes.

My read is that search engines are a necessity born of no intrinsic indexing-and-forwarding capability which would render them unnecessary. THAT still has further issues (mostly around trust)...

@freakazoid ... and reputation.

But a mechanism in which:

1. Websites could self-index.
2. Indexes could be shared, aggregated, and forwarded.
3. Search could be distributed.
4. Auditing against false/misleading indexing was supported.
5. Original authorship / first-publication was known.

... might disrupt things a tad.
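As a rough illustration of the self-index / aggregate / search / audit items above, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names (`self_index`, `merge_indexes`, `search`, `audit`), the whitespace tokenizer, and the `example` URLs are all hypothetical, and it says nothing about the trust and reputation problems.

```python
from collections import defaultdict

PUNCT = ".,;:!?\"'()"

def self_index(url, text):
    """Build a tiny inverted index (term -> set of URLs) for one page."""
    index = defaultdict(set)
    for raw in text.lower().split():
        term = raw.strip(PUNCT)
        if term:
            index[term].add(url)
    return dict(index)

def merge_indexes(*indexes):
    """Aggregate several per-site indexes into one shared index."""
    merged = defaultdict(set)
    for idx in indexes:
        for term, urls in idx.items():
            merged[term] |= urls
    return dict(merged)

def search(index, query):
    """Return the URLs that contain every term of the query (AND search)."""
    terms = [t.strip(PUNCT) for t in query.lower().split()]
    terms = [t for t in terms if t]
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))

def audit(url, text, claimed):
    """Return terms a claimed index lists for `url` that the page text lacks."""
    actual = self_index(url, text)
    return {t for t, urls in claimed.items() if url in urls and t not in actual}
```

A site would publish the output of `self_index`, aggregators would call `merge_indexes`, and anyone holding the page text could run `audit` against a published index to spot terms that were stuffed in.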

Somewhat more:
news.ycombinator.com/item?id=2

NB: the reputation bits might build off social / netgraph models.

But yes, I've been thinking on this.

@enkiv2 I know SEARX is: en.wikipedia.org/wiki/Searx

Also YaCy as sean mentioned.

There's also something that is/was used for Firefox keyword search, I think OpenSearch, a standard used by multiple sites, pioneered by Amazon.

Being dropped by Firefox BTW.

That provides a query API only, not a distributed index, though.
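To make the "query API only" point concrete: an OpenSearch description gives the client a URL template containing `{searchTerms}`, and the client just fills it in. A minimal Python sketch, assuming a hypothetical `search.example` template and a hypothetical helper name `build_query_url`; optional parameters (written with a trailing `?`, such as `{startPage?}`) are simply blanked here.

```python
import re
from urllib.parse import quote_plus

def build_query_url(template, terms):
    """Fill an OpenSearch-style URL template with the user's search terms."""
    # Drop optional parameters such as {startPage?}; a fuller client would fill them.
    url = re.sub(r"\{[^{}]*\?\}", "", template)
    # Substitute the required {searchTerms} placeholder, URL-encoded.
    return url.replace("{searchTerms}", quote_plus(terms))

# Hypothetical template, not taken from any real site's description document.
TEMPLATE = "https://search.example/?q={searchTerms}&p={startPage?}"
print(build_query_url(TEMPLATE, "inverted index"))
```

Nothing in this exchange moves index data around: the site still owns the index and only answers queries against it.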

@freakazoid @drwho

@dredmorbius @enkiv2 @freakazoid YaCy isn't federated, but Searx is, yeah. YaCy is p2p.

@dredmorbius @enkiv2 @freakazoid Also, the initial criticism of the URL system isn't entirely there: the DNS is annoying, but isn't needed for accessing content on the WWW. You can directly navigate to public IP addresses and it works just as well, which allows you to skip the DNS. (You can even get HTTPS certs for IP addresses.)

Still centralized, which is bad, but centralized in a way that you can't really get around in internetworked communications.
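The "URLs don't require DNS" point can be checked mechanically: if the host part of a URL is an IP literal, no name lookup is needed to reach it. A small Python sketch using only the standard library; `needs_dns` is a hypothetical helper name and the addresses are documentation-range examples.

```python
import ipaddress
from urllib.parse import urlsplit

def needs_dns(url):
    """Return True if the URL's host is a name that must be resolved via DNS."""
    host = urlsplit(url).hostname  # strips the [] around IPv6 literals
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    try:
        ipaddress.ip_address(host)  # parses IPv4 and IPv6 literals
    except ValueError:
        return True   # not an IP literal, so a lookup is required
    return False      # IP literal: connect directly, no DNS involved

for u in ("https://example.org/", "http://192.0.2.10/index.html", "https://[2001:db8::1]/"):
    print(u, needs_dns(u))
```

This only removes the name lookup; address allocation and routing stay centralized, as the post says.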

@kick @enkiv2 @dredmorbius Not true; there are several decentralized routing systems out there. UIP, 6/4, Yggdrasil, Cjdns, I2P, and Tor hidden services to name just a few. Once you're no longer using names that are human-memorizable you can move to addresses that are public key hashes and thus self-certifying.
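A minimal Python sketch of what "self-certifying" means here, under loud assumptions: the address is taken to be a truncated SHA-256 hex digest of the raw public key bytes, which is NOT the actual encoding used by Tor, I2P, Cjdns, or Yggdrasil, and the key bytes are placeholders. `address_for` and `key_matches_address` are hypothetical names.

```python
import hashlib
import hmac

def address_for(public_key: bytes) -> str:
    """Derive an address from a public key (assumed scheme: truncated SHA-256 hex)."""
    return hashlib.sha256(public_key).hexdigest()[:32]

def key_matches_address(public_key: bytes, address: str) -> bool:
    """Check that a presented public key really hashes to the address we dialed."""
    return hmac.compare_digest(address_for(public_key), address)

# Placeholder bytes standing in for a real public key.
peer_key = b"placeholder public key bytes"
addr = address_for(peer_key)
print(addr, key_matches_address(peer_key, addr))
```

The address itself tells you which key to expect, so no registry has to vouch for the binding. A real system would additionally require the peer to prove possession of the matching private key, for example by signing a challenge, which is omitted here.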

A system designed for content retrieval doesn't really need a way to refer to location at all. IPFS, for example, only needs content-based keys and signature-based keys.
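A toy Python sketch of content-based keys: the key is a hash of the bytes, so any holder of the data can serve it and the requester can verify it. This assumes plain SHA-256 hex digests as keys, which is a simplification and not IPFS's actual identifier format; `ContentStore` is a hypothetical class, and signature-based keys are not covered.

```python
import hashlib

class ContentStore:
    """In-memory content-addressed store: key = SHA-256 of the data."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content-derived key."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch data by key, verifying it still hashes to that key."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored data does not match its key")
        return data

store = ContentStore()
key = store.put(b"hello, web")
print(key, store.get(key))
```

Because the key depends only on the bytes, the same document stored on two different hosts gets the same key, and no location appears anywhere in the reference.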

@freakazoid @enkiv2 @dredmorbius I said _really_. None of those are human-readable (unlike IP). Non-human-readable systems miss the point of the WWW, and web of trust stuff is awful and doesn't scale. Human readability in decentralized addressing is a solved problem (more or less) for addressing systems, but there's nothing good implementing the solution yet, so little point.

@kick I'm with you in advocating for human-readable systems. IPv4 is only very barely human-readable, almost entirely by techies. IPv6 simply isn't, nor are most other options.

Arguably DNS is reaching a non-human-readable status through TLD propagation.

Borrowing from some ideas I've been kicking around of search-as-identity (with ... possible additional elements to avoid spoof attacks), and the fact that HTTP's URL is *NOT* bound to DNS, there may be ways around this.

@enkiv2 @freakazoid

@kick I'll disagree with you that WoT doesn't scale, again, at least in part.

We rely on a mostly-localised WoT all the time in meatspace. Infotech networks' spatial-insensitivity makes this ... hard to replicate, but I'm not prepared to say it's _entirely_ impossible.

Addressing based on underlying identifiers, tied to more than just content (I'm pretty sure that _isn't_ ultimately sufficient), we might end up with _something_ useful.

@enkiv2 @freakazoid

@dredmorbius @enkiv2 @freakazoid WoT doesn't scale for average users. For technical users it does. WoT doesn't work over the phone, for example, or on e-mail, because people are easily convinced that malicious actors are within their WoT in targeted attacks. This is going to get worse, esp. with recent FastSpeech & Tacotron publications/code releases.

@kick @enkiv2 @dredmorbius @freakazoid This body remembers when the definition of "geek" was someone who used a computer to exchange text chat messages with people. At least, that's what it meant at UCSC. Going back further, was it Augustine who was mightily impressed that Anselm could read without moving his lips?

Mastodon on NerdCulture