Author here. Amusingly, I wanted to link to the HN discussion on the page, but couldn't figure out how to... I don't think it's possible! Something I hadn't considered until just now with a content addressable web is there are no cycles in the links.
Edit: Another amusing anecdote. I've been following this on log2viz (the heroku realtime performance monitoring tool), and was aghast to see high response times and memory usage. Then I realized I was actually looking at the dashboard for one of my rails apps! Phoenix is sitting pretty at about 35MB memory usage, and a median response time of 5ms, so far.
I once solved a similar problem by creating a link at tinyurl and then updating that link to point to whatever the name of my uploaded file had become :)
This is a problem that Freenet solved a long time ago. It means you need public-key cryptography rather than hashes for security, but they support updating pages--and, of course, any and all leaf content can still use hashes for better cacheability.
It's worth noting that WebRTC still requires a server of SOME sort -- typically you want a "signaling" server, and you need at least a STUN server to pierce firewalls, though the latter can be found for free.
And some corporate networks are too restrictive to be pierced by STUN, which means you need a TURN server, which is a relay server.
But yes, 98% of the time, WebRTC can go directly browser to browser. As long as neither browser is Internet Explorer. Sigh.
Is it even possible to open a socket from a browser to another browser directly? I was under the impression that was restricted for security reasons (websites becoming botnets and such).
An idea to make this (and the editable-content suggestion in another comment) possible could be to add something to the protocol that records the SHA hashes of "previous versions".
If you want to edit a page, create a new page that includes the SHA hash of the previous version. Then, if you get a request for the content of an old version's hash, you could suggest your hash as an updated version.
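A minimal sketch of what that could look like (module and field names here are purely illustrative, not part of any real protocol):

```elixir
# Hypothetical sketch of a page record that carries its ancestors' hashes.
defmodule VersionedPage do
  defstruct content: "", prev_hashes: []

  # The page's address is the hash of its content plus its ancestry, so each
  # new version gets a fresh address that still points back in time.
  def address(%__MODULE__{content: content, prev_hashes: prev}) do
    :crypto.hash(:sha256, [content | prev])
    |> Base.encode16(case: :lower)
  end

  # Editing creates a brand-new page whose prev_hashes includes the old address,
  # letting a peer that gets a request for the old hash suggest the newer one.
  def edit(%__MODULE__{} = old, new_content) do
    %__MODULE__{content: new_content, prev_hashes: [address(old) | old.prev_hashes]}
  end
end
```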
Hmm, ok, but that is just an implementation detail. I just see a really simple service here that could become really beautiful if one could manipulate the content once it is in your browser, so the next time it is passed along, it will have evolved. Into what? I don't know. But what if I seeded the system with an image, a simple drawing or a shape, vector or whatever, and then just sat back and observed how other people took it further? Like graffiti. Forking of content could be interesting. Or like GitHub, but a peer-to-peer version?
Doesn't git use hashes also? The old hash could redirect to the new "commit" hash, keeping the cycle and allowing for updates at the same time. One of those updates could be the page "changelog".
Yes, git does use hashes, but git similarly doesn't work in that manner for the same reason. Git commits can only point to commits in the past and not future commits. It is not possible to update old commits like you suggest.
The HEAD of a repository is like a pointer to the hash of a commit. You may think of HEAD like a repointable alias. `cat .git/refs/heads/master` in any of your git repositories to see what I mean.
It changes the content of the commit (by iterating the hash in the commit message itself) until the hash of the child commit matches. A big part of the trick is that it only looks at a prefix of the hash, so the search space is much smaller than the full SHA1 hash.
I just used firebug to alter the page in my browser. I changed the first heading from "This page exists only if someone is looking at it" to "What happens if I use Firebug to alter this page in-situ. Do I end up with malware?".
Will this change propagate? Has anyone yet seen this modified page in their own browser?
Edit:
From the original page: "The server double checks the hash of the content it gets and then passes it along to the new person." I guess the answer is that the change won't propagate?
Correct, the change won't propagate for the reasons you say. (See source [0]). Also, though, note that the content comes to you as a blob of JSON, which gets rendered into your page, and you respond to future content requests with that blob of JSON, rather than the HTML itself.
>and you shouldn't be trusting me any more than any other random internet person.
Because I did trust you more than most other random internet persons, I trusted your javascript by temporarily whitelisting it in my browser to view the contents. :)
Are you able to adjust the URL of the HN submission? I don't feel like creating one just to check, but that would make it easy to link the discussion: create the discussion before the page.
No, but that would be the way to do it, for sure. There's a link to "edit" my submission, but the form it takes me to only allows updating the title. It displays the URL as ordinary text, rather than a text input.
The web pages are recognized by their hash, so you can't change them after you set up the page. The problem is that you would have to supply HN with the link to the page, and then go back and edit the page to include the link to the new HN article. Since you can't go back and edit the page, you can't include the link.
The hash is a hash of the content itself, so the link is immutable - if you changed the content, the link wouldn't be pointing to the same thing anymore.
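A quick illustration of that in iex (the example HTML strings are made up; the point is just that a one-character edit changes the address):

```elixir
# The address *is* the SHA-256 of the content, so any edit yields a new address.
iex> hash = fn content -> :crypto.hash(:sha256, content) |> Base.encode16(case: :lower) end
iex> hash.("<h1>This page exists only if someone is looking at it</h1>")
# => some 64-character hex digest
iex> hash.("<h1>This page exists only if someone is looking at it!</h1>")
# => a completely different digest, i.e. a different page address
```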
Could a page leave the network and come back via a computer that's viewing the page, goes offline, and then comes back? i.e. is there anything to prevent it?
I'm not 100% sure how this all fits together, but you should be able to recreate a page, so long as the content was exactly the same and therefore produced the same hash.
So you could set it up to automatically recreate the page when your machine comes back online.
Not quite, but I assumed similarly. I'm wondering if it'll connect to the peers automatically after coming back online if just left in the browser. If it did, then it'd be possible to keep a page unpublished except when asked for through another channel. It's definitely a really interesting idea to play with. It makes me wonder if it could be used to create a nicer interface for things like Freenet by doing more on the client side these days.
I've actually seen people successfully get a tweet to link to itself through trial and error. You can try and guess the IDs and you will eventually get it right.
Send the link with the hash using your Gmail account, stay tuned at the corresponding page on http://ephemeralp2p.durazo.us, and watch “Currently viewing” pop to "2" immediately after you send the mail. Obviously, the Google borg is slurping up each and every address on the web it is fed.
More insidious theories aside, this is probably at least used to drive the "warning, this looks like a phishing attack" notice which gets inserted inline in some emails.
Then said link is in violation of web standards that have existed for literally decades. I believe I've heard stories about Google bots deleting entire forums because the forums performed destructive actions via GET calls to specific URLs.
If what you said makes sense to you, ask yourself how google can crawl any URL at all, considering communicating anything to any server could trigger a destructive action.
Beware, one of the unwritten laws of HN is that any post containing anything vaguely reminiscent of reddit gets downvoted. Upvoted in order to pre-empt the inevitable downvote.
They do and they are (at least where I have worked), but it seems that Google et al. are smart enough not to follow links containing text like "Unsubscribe".
POSTs are not by definition idempotent. You can make a server's response to a POST idempotent, but when you want multiple identical requests to have different effects, POST is the method to use (vs. GET, PUT, DELETE, etc.).
Implementors should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they might take which may have an unexpected significance to themselves or others.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
I think the concept is akin to a beach ball that is passed from hand to hand. Only one instance of it exists. With a webserver, a new instance is created for each user viewing the page.
I once implemented a crude but serverless P2P in JavaScript by pulse-width-modulating information inside bandwidth usage; anyone on the same Wi-Fi frequency could then stream some arbitrary fixed-bitrate content and observe the drops in their bandwidth throughput to pick up the modulated information. But it was hellishly slow (a couple of bytes a second at best, a couple of bits at worst), and it stopped working as soon as there were 3 devices on the frequency.
Would be interested to explore other ways to make pure-JS mesh networking happen. I've thought of using the JavaScript image pinging trick to "port scan" the entire subnet for peers, and communicate by "mildly DDoSing" each other with pings, but haven't actually tried yet.
Another potential route to pure-JS P2P is using the microphone and speaker to build a mesh network at inaudible frequencies, but this requires microphone permissions, won't work beyond the same room, and would manage only a few hundred bytes per second at best.
Feross has built the thing the author talks about: a BitTorrent client in the browser called WebTorrent. It is super awesome and incredibly useful for sending files. Check it out at http://instant.io !
This is a fantastic example of using Phoenix channels! For those interested in how channels work, check out my ElixirConfEU Phoenix keynote where I give an overview:
Very nice idea and I'm sure that it's a great way to play with Elixir/a platform. Well done!
Two questions:
1)
You talk about P2P uses, but would it be feasible to 'seed' a true P2P net? A site that delivers the application, and everything from there is P2P vs. 'ask the server so that the server asks potential peers (clients?)'?
2)
I got this at the top:
Connecting to websocket.
Connected!
Listening for content for hash 2bbbf21959178ef2f935e90fc60e5b6e368d27514fe305ca7dcecc32c0134838
Requesting content.
Received content for hash 2bbbf21959178ef2f935e90fc60e5b6e368d27514fe305ca7dcecc32c0134838
Received content for hash 2bbbf21959178ef2f935e90fc60e5b6e368d27514fe305ca7dcecc32c0134838
Standing by... ready to share this content!
Two answers for the hash? Intentional? Fine? If that happens quite a lot, you might waste more bandwidth/resources than necessary in your experiment?
For (1) I'm not sure. I'm primarily a backend dev, so this was partly an experiment for me to play with the front end, too. I kind of hope javascript is not able to connect "sideways" to other ordinary browsers. I would think they'd need to be running some sort of server, which I don't think the browser can get going. Would be happy to learn more about this from someone more knowledgeable.
Regarding (2), you should have seen my first approach: Every new client broadcasted a request for the content, and everyone with it responded! Now that was a waste of bandwidth.
But what I've done here is send the request for content to each person who has it with probability 1/N, where N is roughly the number of people who have it. So in your case, it looks like it got sent to two folks. Sometimes it gets sent to no one, in which case the client will retry in 2 seconds.
It was a little tricky to figure out since phoenix runs every socket connection in its own erlang process (great for robustness and concurrency, but a real mind bender if you're not used to it). So this probabilistic approach was the best I could come up with, instead of having some master process select the right person to send the request to.
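A hedged sketch of that probabilistic fan-out (not the actual source; module, event, and payload names are made up). Each holder's channel process hears the broadcast and decides for itself, with probability 1/N, whether to forward the request to its browser, so no master process has to pick a responder:

```elixir
defmodule EphemeralSketch.PageChannel do
  use Phoenix.Channel

  # Intercept the broadcast so each holder's process can decide independently.
  intercept ["request_content"]

  def join("page:" <> _hash, _params, socket), do: {:ok, socket}

  # Every process holding the content receives this broadcast; each one forwards
  # the request to its browser with probability 1/N, where N is (roughly) the
  # number of current holders. If nobody forwards it, the new visitor retries.
  def handle_out("request_content", %{holders: n} = payload, socket) do
    if :rand.uniform() < 1.0 / max(n, 1) do
      push(socket, "request_content", payload)
    end

    {:noreply, socket}
  end
end
```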
True, and a very clear way of putting it. Interestingly, since Erlang is so well-suited for distributed computing, and Phoenix (the web framework I'm using) has been built to take advantage of that, it wouldn't be too hard to let someone else spin up this same service and take part in distributing the content.
Off the top of my head, the only thing that wouldn't work is the "Currently Viewing" counter, which relies on this all running on a single heroku dyno. Otherwise, the socket messages are routed over a distributed PubSub layer, which should be pretty easy to tap into.
It's a cool idea. What do you think about running it from an Android phone? (Isn't there an Apache APK?) So that you can only view the page while connected to the phone's hotspot?
This is very cool, and the content-addressable web is going to be a huge leap forward for permanence and publishing without needing to run a server. Lots of people are working towards this future. In particular, you should look into IPFS.
Kind of related, http://channelurl.com/ -- the site only has those pages that are linked to. In other words, the content of a page is taken from its URL.
I think this is great. The page is basically a markdown viewer and the link has markdown encoded in it. If you pass the URL through a URL shortener, then the URL shortener is effectively hosting the content of your page. It would be cool if this was static HTML and the markdown was rendered in JS.
I am getting "Currently viewing: 0, Connecting to websocket..." and nothing happens. Does this mean the trail has been lost and even the last user has closed their browser?
Very cool! Interesting to see your findings. Maybe the browser is already capable of simplifying P2P file sharing. It would be good to get rid of today's torrent clients and have your browser run some slick P2P service instead that also serves as a convenient way of sharing or sending files with colleagues and friends. I also really love the idea of tossing web content around between peers like that. Good work so far. Don't think you're done yet.
Hm, it's working for me. Did you try letting it sit a moment? The server doesn't actually store the content, so it has to load a blank page first. Eventually, you should receive the actual page content from a peer (via the server, to verify that the content is correct).
If you ditched the hash that checks the content and made the whole page 'content-editable' (maybe with something that stripped certain tags like images and video for obvious reasons), that might be fun. :)
Ha, that's a fun idea. I imagine it would regress to a lot of inappropriate ASCII art pretty quickly, though...
The SHA-256 checksum is actually part of what makes it very interesting to me, though. Since the content of the page is guaranteed by the location of it, it's kind of a shared web that anyone can help host.
Very cool stuff. I wonder if it would be possible to host an entire webapp with sockets/RTC. I don't see why not, but you'd have to re-invent decades' worth of architecture built around HTTP.
You should take a look at WebRTC's DataChannel.
We're using it at Streamroot to do P2P video streaming in the browser.
You're talking about filesharing in P2P, again, totally something you can do with WebRTC. You should take a look at PeerJS if you want to experiment in no time.
You also talked about BitTorrent in the browser: you should definitely take a look at WebTorrent
It appears the content does travel to and through the server, for both initial hashing and later relaying to other clients.
Couldn't this be done without ever sending any content to the server, via in-browser hashing and client-to-client connections? (The server could still be used to help clients discover each other.)
An attempt (copy paste really) to try and keep the window open and the content alive. Also, a request for the root/homepage could be funny if it was meta [0].
Apparently it simply tries to open a bunch of modal windows on close. Chrome is smart enough to figure that out, and only about five windows open before the script is killed.
So cool! This is almost BitTorrent, but the dependency on the server for client coordination is still a weak point, and makes it more like Napster than BT.
It would be an interesting project to try something similar with WebRTC, to allow (as I understand it) actual P2P communication.
As the page is content-addressed, you can only change the content by changing the address, thus creating a whole new webpage - address and content, after all, define the resource.
Great proof of concept. I think it would also be interesting to have a version that uses public keys instead of (or in addition to) the checksum, so people could publish content that they can edit, with the updates signed.
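A minimal sketch of that idea, assuming an OTP release with Ed25519 support in :crypto (the variable names and the "address the page by the key" scheme are illustrative, not anything the project does):

```elixir
# Address a page by the author's public key, and accept any update whose
# signature verifies against that key.
{pub, priv} = :crypto.generate_key(:eddsa, :ed25519)

content_v1 = "<h1>Hello</h1>"
sig_v1 = :crypto.sign(:eddsa, :none, content_v1, [priv, :ed25519])

# The page's address could be the hash of the public key, which never changes...
address = :crypto.hash(:sha256, pub) |> Base.encode16(case: :lower)

# ...while peers only accept content that the key's owner actually signed.
true = :crypto.verify(:eddsa, :none, content_v1, sig_v1, [pub, :ed25519])
```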
Can't help but comment that the title reminds me of the Doctor Who episode "Blink", where the statues only exist if someone is looking at them. Anyway, cool Elixir project!
Interesting, this is like a P2P Snapchat for text. Or no. Better said, I am having trouble trying to associate this with something that exists already.
For the very unlikely case someone can't see this:
This page exists only if someone is looking at it
Hi! Welcome to Ephemeral P2P. Thank you for loading this content. Your browser retrieved it from the browser of someone currently viewing this page. You're now a part of the network and someone who loads this page in the future may get it from you!
The server does not store this content anywhere, so as soon as the last person closes their browser, it's gone. You can see a count of how "healthy" the page is (how many people are viewing the content) at the top.
How does it work?
At a high level, this is what happens:
From the homepage you enter the content you want to share.
When you submit it, you register the SHA-256 hash of the content on the server.
Your browser stands by with an open websocket to the server.
When someone else visits a link "/[sha256hash]", the server tries to retrieve the content from anyone registered with that hash. The server double checks the hash of the content it gets and then passes it along to the new person.
That new person now registers with the server as someone who knows the content for that hash.
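A hedged sketch of the core of that flow (not the actual app code; the module name is made up): the server keeps only hash-to-holder registrations, never the content itself, and re-hashes whatever a holder sends back before forwarding it to the new visitor.

```elixir
defmodule EphemeralSketch.Relay do
  # Re-hash content received from a holder and compare it to the requested
  # address before passing it along to the new visitor.
  def verify_and_forward(expected_hash, content) do
    actual =
      :crypto.hash(:sha256, content)
      |> Base.encode16(case: :lower)

    if actual == expected_hash do
      {:ok, content}            # safe to pass along to the new visitor
    else
      {:error, :hash_mismatch}  # a holder sent tampered or corrupted content
    end
  end
end
```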
But why?
Just a simple experiment to play with websockets and concurrency.
The app is built in Elixir (compiles to erlang) with the Phoenix framework, since it supports websockets out of the box. It's very "railsy" and in addition to rails-style "controllers", it has "channels" which are like controllers for websockets. Made building this thing a snap.
The app is hosted on a heroku 1X dyno and I'm hoping this hits the front page of HN to see how many concurrent connections I can squeeze out of it. Erlang is known for its concurrency, so I'd love to know how Elixir/Phoenix can serve as an alternative to my usual rails when highly concurrent solutions are needed. I plan to tweet my findings, so you can follow me (@losvedir) if you're interested in them.
Where do we go from here?
There are two aspects to this project that I've found quite interesting, that I hope people explore:
Peer-to-peer over browser websockets
Does something like this exist? I opted for P2P of HTML injected into a container div, since I didn't want to deal with the legalities of clients sharing binary files back and forth. But someone wanting to deal with DMCA and all that might have an interesting service here.
I could see this being a great alternative to something like sendfile (I think that's a thing?), or DropBox, or what have you, when you just want to send a file to a friend and it's too big for email. Big files could even be broken up into individual SHA-256'ed pieces, and the list of SHA-256 hashes could be the thing sent. The other side would then fetch each piece in turn and re-assemble.
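An illustrative sketch of that chunking idea (module name and chunk size are made up): split a big file into pieces, hash each piece, and share only the list of hashes as a manifest; the receiver fetches pieces by hash and reassembles them.

```elixir
defmodule ChunkSketch do
  @chunk_size 256 * 1024

  # Stream the file in fixed-size chunks and return the SHA-256 of each one.
  def manifest(path) do
    path
    |> File.stream!([], @chunk_size)
    |> Enum.map(fn chunk ->
      :crypto.hash(:sha256, chunk) |> Base.encode16(case: :lower)
    end)
  end
end
```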
But that's starting to sound kind of like BitTorrent... I wonder if someone could even make a web-based bittorrent client along these lines.
Content addressed web
The cool thing about the page content being represented by its SHA-256 hash is that it doesn't matter where the content comes from. If anyone sends you the content, you can verify that it's what you were looking for. This makes it well suited for peer-to-peer or otherwise distributed file serving.
Imagine essentially an archival service where all kinds of content (web pages, mp3s, videos, etc) are indexed according to their SHA-256. Hopefully this content would be distributed around the world and backed up and all that good stuff. Then if someone tweets a "hey, checkout this video I made [a2b89..]", it can be retrieved from this "global store of all content" using that hash. It's already very common to mention the SHA-256 alongside a download. Just think if you could take that and download from this service.
Wikipedia is an amazing collection of knowledge in terms of articles. It seems like it would be valuable to have a similar nonprofit service that was a repository of "notable" files.
A quick warning
I don't do any sanitization of the shared HTML content, so be wary of other links that folks may post. But I don't think it's too great of a security risk, since there's nothing private here (no https), and you shouldn't be trusting me any more than any other random internet person.
In closing...
Thanks for checking this out! Feel free to fork the repo on github and play around with it yourself!
And a big thanks to the friendly folks on the #elixir-lang IRC channel who have been very helpful in building this.
A (more) permanent version of this content can be found here.
Author here. I haven't done too much with Elixir yet, other than this Phoenix app, but I am totally enamored of it from what I've seen.
First and foremost, the community is absolutely the friendliest, most helpful group of folks.
I'm happy with its performance. Phoenix feels similar to rails to me (I'm primarily a rails dev), but with easily 10X performance. And the concurrency. Oh, the concurrency. This is Erlang's bread and butter, and Elixir and Phoenix are built to take complete advantage of this. I love that every request gets its own Erlang process, and you don't have to worry about it blocking anything else. The Erlang scheduler will keep it fair, cutting off any long running computation if it has to, to keep the latency down on new requests.
I really like how interesting and mind bending Elixir's purely functional, process-oriented approach is. Nothing (well, very little) is mutable! You can sort of simulate mutation by spinning up a little infinitely looping tail-recursive process and sending messages to it. I encourage you to go through the excellent Elixir Getting Started guide [0], and in particular the "Processes" section for more on this.
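A tiny example of that "mutation via a looping process" idea, along the lines of the Processes section of the guide (the Counter module here is just an illustration):

```elixir
# The counter's state lives only in the argument of a tail-recursive loop,
# and the only way to "change" it is to send the process a message.
defmodule Counter do
  def start(initial), do: spawn(fn -> loop(initial) end)

  defp loop(count) do
    receive do
      :increment ->
        loop(count + 1)

      {:get, caller} ->
        send(caller, {:count, count})
        loop(count)
    end
  end
end

# pid = Counter.start(0)
# send(pid, :increment)
# send(pid, {:get, self()})
# receive do: ({:count, n} -> n)   # => 1
```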
But I think the thing I like most about it so far, is the introduction to battle-tested Erlang approaches to robustness (OTP). This is a set of abstractions and libraries that have been iterated on over the years and are a fantastic way of building an app that's resilient to failure. Elixir, as it does, takes these abstractions and libraries, and puts enough sugar and consistency over them to make them a joy to use. I find this approach [1] supremely elegant (it's erlang, but applicable to elixir).
Elixir is amazing. It's homoiconic, like Lisp, so being able to execute code at compile time can have a game-changing impact on reducing boilerplate.
And in addition to that, it's functional, so I can get the awesome benefits of immutable programming, but without the dizzying complexity of, for example, Scala's type system. And it's based on the Erlang VM, so I get the benefits of battle-hardened concurrency and IO underpinnings.
Previously I stayed away from things like Go and Node/Express, because I felt like they weren't as easy for the type of work I do as Django, and were missing key things like the admin CRUD interface. But when I looked at what's in the Phoenix framework[1] and read the source, it started to feel like I was witnessing the dawn of the next Ruby on Rails revolution.
http://www.phoenixframework.org/
The Phoenix framework is so well thought out that making CRUD applications is easy, and it has all the bells and whistles you'd expect from a framework like Rails. But obviously that's just the surface. You have access to OTP with Erlang's processes, supervisors, ETS, etc. You have access to every Erlang module that's been written. You also have access to Elixir modules that load into a Mix project like Ruby gems. Also, it's wicked fast and scales.
But maybe the best part is that the documentation is great. From Elixir's docs to Phoenix's to the code docs, everything is clear. The community is quick to respond on IRC or GitHub.
For downsides, just newness: the lack of good blog posts makes me feel like I'm re-inventing the wheel sometimes. Additionally, there aren't many questions up on Stack Overflow, so my errors often come up with 0 Google hits.
I'm the author of a long-existing IndexedDB JavaScript library. IDB can store all kinds of wonderful objects, including binary blobs. If you're inspired by client-side data, you should check out IndexedDB!
Ironically, that is how most websites and webpages exist now - it's been the norm for a couple of years, since we started dynamically showing content via AJAX and lazy loading. There is nothing novel here.
You missed the idea here. It's not about AJAX loading. It's about how the content is served to the client without the server storing the data (the content) to be served. You should spend more time on the shared link to glean the beauty of this.