Feed aggregator
Jack Moffitt: Better Ejabberd Vhosts
When Chesspark migrated to ejabberd I explained how we modified ejabberd's authentication and SQL queries to support storing users by their bare JIDs instead of just the local parts (the foo part of foo@example.com). This lets each vhost in ejabberd have a completely separate set of usernames, as opposed to making all usernames globally unique across the server.
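To illustrate the difference, here is a minimal sketch in JavaScript. It is not the patch itself (the patch changes ejabberd's SQL queries), and the in-memory stores and the bareJid and register helpers are hypothetical names made up for this example. Lowercasing stands in for real JID normalization, which is more involved.

```javascript
// Hypothetical helper: builds the bare JID key "local@domain".
// Lowercasing is a simplification of real JID normalization.
function bareJid(local, domain) {
  return local.toLowerCase() + '@' + domain.toLowerCase();
}

// Hypothetical in-memory user stores standing in for the SQL tables.
const byLocalPart = new Map();
const byBareJid = new Map();

// Registers a user under the given key; returns false if the key is taken.
function register(store, key, password) {
  if (store.has(key)) {
    return false;
  }
  store.set(key, password);
  return true;
}

// Keyed by local part: the second "foo" collides, whatever its vhost.
const firstLocal = register(byLocalPart, 'foo', 'secret1');
const secondLocal = register(byLocalPart, 'foo', 'secret2');

// Keyed by bare JID: each vhost keeps its own "foo".
const firstBare = register(byBareJid, bareJid('foo', 'example.com'), 'secret1');
const secondBare = register(byBareJid, bareJid('foo', 'example.org'), 'secret2');
```

With local-part keys the second registration is rejected, while with bare JID keys both succeed.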
Since I wrote that post, several people have asked me to share this patch. I finally dug it out of the Chesspark code repository and modified it for use with the recent 2.1.1 version of ejabberd. You can find the patch included with bug EJAB-1131.
Please note that I only modified the queries that were needed for Chesspark. Not all modules' queries were transformed. Also, I didn't make any changes to the MSSQL queries. However, if you look at the patch, it should be extremely obvious how to make further modifications if I missed something that you need. This patch has been running well in production on Chesspark for over a year now, but I have not fully tested this updated version.
Please let the ejabberd team know if you find this functionality useful, and please let me know if you have any questions or if you run into any problems.
Process One: Sea Beyond event summary
On Thursday, 17 December, ProcessOne held an event in Paris, France, about real-time communications. Here are the highlights.
As announced and scheduled, we held our "Sea Beyond" event on Thursday, 17 December. It took place at Le Before, an art gallery in the center of Paris. We usually get snow in the French capital only once a year, and the weather chose that special day to cover the streets in a white coat.
We had almost 40 attendees over the whole day, either for the XMPP Sandbox part for developers or for the Lighthouse part for presentations.
We had a wide range of people, from pure players to operators, including manufacturers and freelancers, and even the press. To name a few: Nokia, Erlang Consulting, Yoono, af83, Ohm Force, StudiVZ, Meetic, Tandberg, Mozilla Foundation Europe, Orange, INRIA, ...
During this real-time communications event, there was a real sandbox as well as a real (tiny) lighthouse. There were also real pizzas for lunch, as well as real Champagne at the end of the day.
The first part of the day should have been a hacking session; at least, that was what we had planned. But people took a lot of interest in the workshops about PubSub and Jingle Nodes. People then talked to each other about real-time technologies and the future of the internet and telecom. We demoed our products, such as the OneTeam client, which we hope to release soon, and our supervision console, TeamLeader. Mathieu Barcikowski also demonstrated Yoono to some people.
Then we reorganized the room for the presentations. We started with the lightning talks. Philippe Sultan from INRIA, Asterisk developer and book author, presented the Asterisk module for Jingle. Nicolas Vérité, ProcessOne Project Manager, then presented the Talkr.IM XMPP service, soon to be launched. Jérôme Sautret, ProcessOne CTO, summarized the Sea Beyond XMPP sandbox. And Mathieu Barcikowski from Yoono finished the session by demonstrating the Yoono Firefox extension, this time to the whole audience.
The presentations started with Thiago Camargo from Nimbuzz, who introduced his concept of Jingle Nodes, a network of XMPP peers for relay and relay discovery. This presentation impressed quite a lot of people, both for the technology and for Thiago's ease with these subjects. Christophe Romain from ProcessOne then detailed the use of PubSub at the BBC for LiveText over IP on their radio channels in the browser. The session was closed by Mickaël Rémond from ProcessOne, who talked about and demoed our alpha products implementing the Wave real-time communication system from Google.
Then we had a cocktail to conclude the day, with once again very interesting and enriching conversations.
We believe people enjoyed the whole day. We hope they enjoyed it as much as we did!
Thank you to everyone who joined and helped make this event a big success!
Jack Moffitt: JavaScript Testing Taxonomy
Everyone talks a lot about testing, but it seems few people actually do much of it. I certainly am guilty of not doing as much automated testing as I should, but I am working hard to improve. I write a lot of JavaScript, so I've been spending a lot of time experimenting with a range of JavaScript testing tools. There are a lot of different options, and I've come to the conclusion that you need several of them.
xUnit Frameworks
There are tons of xUnit inspired testing frameworks for JavaScript: QUnit, JsUnit, YUI Test, etc. There are probably over a dozen of these, and many of the other testing tools include their own xUnit style assertion framework.
Almost all of these work by having you create an HTML page that serves as the test runner. To run or re-run the tests, you simply open the page in a web browser and off it goes.
The tests that one writes are quite valuable, but the way you run them leaves a lot to be desired. For example, to test in multiple browsers you must hit refresh in all those browsers, and there is no way to run a specific test in isolation easily.
These HTML test runners have one very good quality though; they make it trivial for an outside person to run the tests. All they have to do is open up the HTML page in a browser, and the tests run.
Because of the HTML runner limitations, I think this can't be the only way you run tests. However, because it's so easy to run tests this way, I think it's important to have it, since users can easily use it to run the tests without any effort.
Testing From the Command Line
Developers want to run tests from their IDEs, run specific tests only, and avoid hitting refresh in lots of open browsers. This is commonly achieved by making the tests runnable from the command line, and there are a number of frameworks which do this.
EnvJS and a few others set up pseudo-browser environments inside stand-alone JavaScript interpreters like Rhino. JsTestDriver captures browsers and provides a way to run tests quickly and easily inside them, without having to interact with them directly.
These types of tools make JavaScript testing very similar to testing in other languages. You can run tests in isolation, automate the running of tests, and run tests extremely quickly.
Unfortunately, these tools have their drawbacks. It often takes some infrastructure to run the tests. For example, the test runner may need Rhino or another JavaScript interpreter. Also, running the tests outside of a true browser environment may not lead to accurate results.
Testing Farms and Integration Tests
None of the previous tools help much with integration tests, and very few of them have any kind of support for controlling browsers. For those tasks there are tools like Windmill and Selenium.
Windmill, for example, can launch a browser, run a test suite, and then shut the browser down at the end. It can also record and play back user interactions, making it a great tool for integration testing as well.
Either tool can be integrated with things like BuildBot to automatically run test suites upon commit, and some Selenium developers even have a startup to run tests in the cloud on a wide variety of browsers and platforms automatically.
The main downside of these tools is that they are slow. It takes time to launch and terminate a browser, and this makes them less than ideal for test driven development where a developer is running the test suite many times.
You Probably Need Them All
I think each of these three kinds of tools has a place in your testing workflow.
HTML test runners are useful for easily running the tests and providing users and external developers a hassle-free way to write and run tests. After all, if it's not dead simple to run and write tests, people won't bother.
Command line driven tools are a necessity for the test driven developer. Tests must be able to run extremely fast since they will run multiple times, and it must be possible to run just the specific tests you want. It takes a little work to set up, but the result is a much more pleasant testing experience. A tool like JsTestDriver can also make it just as easy to test in many browsers at the same time, all controlled from the command line and lightning fast.
Finally, if you want a test farm or to do more system-level tests, tools like Windmill and Selenium are there. Developers might interact directly with JsTestDriver or an HTML runner, but the continuous build systems which run the suite on every commit aren't nearly as sensitive to run speed. Also, starting from a fresh browser every time can be helpful to make the test runs consistent.
Fortunately, all these tools can usually be made to work around a basic xUnit style assertion framework. The tests themselves don't need to be duplicated and remain simple.
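As a rough sketch of what "a basic xUnit style assertion framework" can look like, here is a tiny one in plain JavaScript. It is not the API of QUnit, YUI Test, or JsTestDriver; assertEqual, testCase, and runAll are hypothetical names. The point is that the tests are written once, and the runner only needs a reporting function, which could write into an HTML page or to the console.

```javascript
// Hypothetical assertion: throws when the two values differ.
function assertEqual(expected, actual, message) {
  if (expected !== actual) {
    throw new Error((message || 'assertEqual failed') +
      ': expected ' + expected + ' but got ' + actual);
  }
}

// Registry of tests, shared by every runner.
const tests = [];
function testCase(name, fn) {
  tests.push({ name: name, fn: fn });
}

// Runs all tests, or only the test named by `only` if it is given.
// `report` is supplied by the environment (HTML page, console, ...).
function runAll(report, only) {
  let passed = 0;
  let failed = 0;
  tests.forEach(function (t) {
    if (only && t.name !== only) {
      return;
    }
    try {
      t.fn();
      passed += 1;
      report('PASS ' + t.name);
    } catch (e) {
      failed += 1;
      report('FAIL ' + t.name + ': ' + e.message);
    }
  });
  return { passed: passed, failed: failed };
}

testCase('addition works', function () {
  assertEqual(4, 2 + 2);
});
testCase('string join works', function () {
  assertEqual('a-b', ['a', 'b'].join('-'));
});

// A command line runner would pass console.log; here we collect the lines.
const lines = [];
const result = runAll(function (line) { lines.push(line); });
```

Passing a test name as the second argument to runAll runs that test in isolation, which is the feature the HTML runners tend to lack.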
My Current Experiments
I'm currently evaluating YUI Test for the HTML runner and assertion framework. It seems quite well designed, and I really like its ability to simulate common user interactions for unit testing UI controls. That kind of attention to detail is missing from most of the other, similar frameworks like QUnit.
I hate running tests in the browser, however pretty the HTML, so I'm using JsTestDriver from the command line. I can easily capture Safari, Chrome, and Firefox at the beginning of the day and run tests against them all simultaneously as I'm writing code. With a little bit of configuration, I have got it using YUI Test instead of its built-in assertion framework.
I still need to set up a test farm so that I can run these tests on the full range of platforms and browsers, and I'm thinking about modifying Windmill to use YUI Test in order to facilitate this.
I haven't even started evaluating the various mocking frameworks that exist. Specifically for Strophe, I'll need to mock XMLHttpRequest interactions in order to test anything interesting. I also haven't looked at code coverage reporting, but several frameworks I mentioned have some support for this.
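To give an idea of what mocking XMLHttpRequest involves, here is a hand-rolled sketch. It does not use any existing mocking framework, and it is not how Strophe is structured internally. FakeXHR, its respond helper, and postBody are hypothetical names, and the fake covers only a small subset of the real XMLHttpRequest interface.

```javascript
// Hypothetical stand-in for XMLHttpRequest, covering only what we need.
function FakeXHR() {
  this.readyState = 0;
  this.status = 0;
  this.responseText = '';
  this.sentBodies = [];
  this.onreadystatechange = null;
}

FakeXHR.prototype.open = function (method, url, async) {
  this.method = method;
  this.url = url;
  this.readyState = 1;
};

// Records the body instead of touching the network.
FakeXHR.prototype.send = function (body) {
  this.sentBodies.push(body);
};

// Test-only helper: simulates the server's reply.
FakeXHR.prototype.respond = function (status, text) {
  this.status = status;
  this.responseText = text;
  this.readyState = 4;
  if (this.onreadystatechange) {
    this.onreadystatechange();
  }
};

// Hypothetical code under test. It takes the XHR constructor as a
// parameter so that a test can inject the fake.
function postBody(XHR, url, body, callback) {
  const xhr = new XHR();
  xhr.open('POST', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      callback(xhr.status, xhr.responseText);
    }
  };
  xhr.send(body);
  return xhr;
}

let received = null;
const fake = postBody(FakeXHR, '/http-bind', '<body/>', function (status, text) {
  received = { status: status, text: text };
});
fake.respond(200, '<body/>');
```

Nothing leaves the process, so a test can check both what was sent and how the code reacts to any response it likes.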
What About You?
I'd love to hear your experiences with JavaScript testing. Please let me know what tools you've found useful and how they've worked out for you.
Process One: Real-time web? Real-time search? No, it is WAR!
Recently, there has been a buzz around real time searching. Start-ups like Collecta and OneRiot, together with big names like Google, Yahoo! and Bing, have added a real time feature to their search engines. As a result, the phrase "real time search engine" has emerged and has been reused many times by the media.
The problem is that this does not mean anything! Real time searches do not exist. You search in a mass of existing data, generated in the past. The search engines, based on different relevance algorithms, will return a sorted list of results matching your request. But these results relate to information created in the past, not the present.
What real time search start-ups do is "real time filtering". They plug into various streams of data from one end. They use your query term to filter this incoming stream and finally add new matching content to your result page. No doubt, this is filtering a stream of new data, this time the present, not the past.
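The distinction can be sketched in a few lines of JavaScript. This is only an illustration, not how any of the companies mentioned implement it: search and makeFilter are hypothetical names, and a plain substring match stands in for real relevance ranking.

```javascript
// Search: query a mass of existing data, generated in the past.
function search(archive, term) {
  const needle = term.toLowerCase();
  return archive.filter(function (doc) {
    return doc.text.toLowerCase().includes(needle);
  });
}

// Filtering: returns a handler that is applied to each new item as it
// arrives on the stream, and passes matches on to onMatch.
function makeFilter(term, onMatch) {
  const needle = term.toLowerCase();
  return function (item) {
    if (item.text.toLowerCase().includes(needle)) {
      onMatch(item);
    }
  };
}

// The past: an archive that already exists when the query is made.
const archive = [
  { text: 'XMPP server released' },
  { text: 'Snow in Paris' }
];
const past = search(archive, 'xmpp');

// The present: items arriving after the query was registered.
const live = [];
const onItem = makeFilter('xmpp', function (item) { live.push(item); });
[
  { text: 'New XMPP client announced' },
  { text: 'Pizza for lunch' }
].forEach(onItem);
```

The search runs once over stored data, while the filter keeps running for as long as the stream delivers new items.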
Why does it matter? Because real time search and real time filtering do not compete but complement each other. You cannot say real time filtering start-ups compete with Google. They provide a different service satisfying different needs, which is to get instant and fresh data, filtered out of massive streams.
Google understands this. They recently decided to add real-time news results to their search results page, when configured with the right options. Basically, you switch to a mode where the latest results are relevant to you personally. That way, new real-time sources matching your current search will get through the filter to update your search results in real-time.
Where does this lead? What Google proposes is only the beginning. It is done in a robust but still awkward way. The organisation is moving toward a deep integration between the history (existing data, the past) and real time content (new data, the present), to help you make accurate decisions based on the latest events. That's information-augmented reality. That's web-augmented reality.
At ProcessOne, we call this Web Augmented Reality or WAR. This is a carefully crafted blend of historical data with new relevant data. You have the reality (what has happened), and this is enriched with new sources of information, updated and filtered in real time.
As defined by Wikipedia, Augmented Reality (AR) is a term for a live direct or indirect view of a physical real-world environment, whose elements are merged with (or augmented by) virtual computer-generated imagery, creating a mixed reality.
Web Augmented Reality is the same concept, but applied to data as the initial information source, instead of imagery. This data is then supplemented with new sources of information which are filtered and updated, and finally added to the base of existing data as an overlay.
Web, Web 2.0, real-time web... These are just different stages of evolution that are masking the true underlying movement which is WAR and the browser playing an increasingly important role in communications. This evolution will be of real benefit to the user who will be able to communicate and collaborate in ‘true’ real-time. What this will mean and what it will enable will begin to become clear in 2010.
Would you like to get involved in further discussion about the real time web and collaboration? This topic will be discussed in Paris on December 17th, at the first ProcessOne "Sea Beyond" event! We will demo labs products implementing the concept of WAR, aka Web Augmented Reality.
Jack Moffitt: Make Some Noise: Subtractive Synthesis in Five Minutes
One of my many passions is music. I play keyboards in the band Lousy Robot, and I spend a lot of time fooling around with synthesizers and music technology in general. My second Ignite NM presentation was an attempt to show everyone that they can have some musical fun with subtractive synthesis:
Jack Moffitt: Real-Time Search in Five Minutes
Earlier this year I gave my first Ignite presentation at Ignite New Mexico. Watch below as I explain the what and why of real-time search in five minutes:
Process One: Sea Beyond event programme
We have published the final and up-to-date programme of our "Sea Beyond" event, including lightning talks.
The first SeaBeyond event, by ProcessOne, on real-time communications, takes place tomorrow.
The very first SeaBeyond event will be split into two parts.
In the first part of the day, the XMPP Sandbox, developers will enjoy experimenting with new features.
In the second part of the day, the Lighthouse, there will be presentations on new and upcoming technologies.
First, lightning talks will take place: these are 10-minute talks, briefly presenting products:
- Philippe Sultan, Asterisk developer and book author, Asterisk module for Jingle.
- Nicolas Vérité, ProcessOne Project Manager, Talkr.IM XMPP service.
- Jérôme Sautret, ProcessOne CTO, Result of Sea Beyond XMPP sandbox coding day presentation.
- Mickaël Rémond, ProcessOne Founder, Realtime web? Realtime search? No, it is WAR!
Then, the presentations will introduce new technologies:
- Jingle protocol and supernodes: A Skype-like standard for Voice Over IP for XMPP, but with a twist. Thiago Camargo from Nimbuzz will present his work.
- Rich Presence and event distributions with PEP and pubsub: The BBC case study, or LiveText over IP, by Christophe Romain from ProcessOne
- Google Wave protocol and implementation for real time multi users communication channels, by Mickaël Rémond from ProcessOne
Jack Moffitt: A New Generation of Search Tools
Most search functionality is powered by full-text indices. These indices store information about which documents contain which terms. The technology involved is quite old and simple to understand, but the implementation details are a minefield of complexity. Unfortunately, most implementations were built to solve a specific class of problems and are having trouble meeting the demanding needs of new applications. A new generation of indexing systems is here, heralded by Basho's Riak Search.
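To make "which documents contain which terms" concrete, here is a toy inverted index in JavaScript. It is a sketch of the general idea only and has nothing to do with how Lucene, Sphinx, or Riak Search are implemented; tokenize, addDocument, and lookup are hypothetical names, and everything lives in memory.

```javascript
// Splits text into lowercase terms on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Records, for every term in the text, that document `id` contains it.
function addDocument(index, id, text) {
  tokenize(text).forEach(function (term) {
    if (!index.has(term)) {
      index.set(term, new Set());
    }
    index.get(term).add(id);
  });
}

// Returns the sorted ids of the documents containing the term.
function lookup(index, term) {
  const ids = index.get(term.toLowerCase());
  return ids ? Array.from(ids).sort() : [];
}

// The index maps each term to the set of documents that contain it.
const index = new Map();
addDocument(index, 'doc1', 'Riak is a Dynamo-like store');
addDocument(index, 'doc2', 'Solr and Sphinx index documents');
addDocument(index, 'doc3', 'Riak Search builds an index on Riak');

const riakDocs = lookup(index, 'riak');   // ['doc1', 'doc3']
const indexDocs = lookup(index, 'index'); // ['doc2', 'doc3']
```

Adding a document here is cheap because it is just a few map updates. The hard part in real systems is keeping that property with on-disk data structures and large volumes, which is where the complexity comes in.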
Software like Lucene, Solr, and Sphinx is optimized for a class of problems that doesn't always match up to an application's requirements. They work very well when the document set they index doesn't change very much or very often. Adding new documents to the system is expensive. They have highly sophisticated on-disk data structures that provide high query performance at the cost of expensive modification.
For example, Sphinx has limited support for adding new documents to a running system. From their own documentation:
There's a frequent situation when the total dataset is too big to be reindexed from scratch often, but the amount of new records is rather small. Example: a forum with a 1,000,000 archived posts, but only 1,000 new posts per day.
A typical use of Sphinx is to re-index all the content every day. It can re-index very, very fast, but this means that the latency of a new document being available for searching is a day. This may work for many applications, but most users will expect new content to be searchable immediately.
Solr has sophisticated replication and index merging, but it doesn't fare very well either for high volume updates. There is a significant lag between the time a document is sent to the indexer and the time the index reflects this new content. Making this lag small has serious performance consequences, and I know of few sites who can manage under 5 minutes; most probably do merges hourly, daily, etc.
The more data you have, the worse these properties become. The bigger your index, the longer it will take to merge. This leads many applications to shard their search indexes in the same way that high-traffic sites shard their databases. Solr has some limited support for this built-in, but for the most part, it is a complex task left up to the application developer.
This is not to say that these projects are poor; on the contrary, they are quite good at what they are made for, and contain many components that are generally useful. Many high-profile sites use them to great success. However, for applications with high volumes of new content, low latency requirements, or massive amounts of data, these tools are a poor fit.
The world needs something better.
The problems I outline above with traditional indexing systems are problems that I face every day at Collecta. We have massive volumes of new data rocketing into our system, and we keep all this data on hand for quite a while. In order to launch quickly and begin iterating, there wasn't time to invent a new indexing system. We knew the limitations of what we had and planned accordingly, bootstrapping technologically as opposed to financially.
Indexing at Collecta is there to provide historical context; it is only part of our search technology. Our real-time technology is powered by streaming (using XMPP), not indices. As far as I'm aware, we are the only real-time company that is not based on indexing systems, and that fact probably explains why our search results appear in fractions of a second after, not minutes after, content is published.
Our users would not be impressed if the results page showed them nothing until the next time someone mentioned their query term, so we store a historical archive and index it for queries. Unfortunately, the traditional indexing systems fall well short of our needs.
We need something better.
I've been following the NoSQL community for some time. Collecta was an early adopter of CouchDB and an early customer of Cloudant. Earlier this year, I began to plan for the development of a new indexing system to better meet our needs. That's about the time I first met Justin Sheehy of Basho.
I had begun to think about using a Dynamo-like system as a basis for a new index. It had a lot of desirable properties. Specifically, trading off consistency makes the most sense for search indexes. We already have to live with this trade-off with the other systems which are replicated, even though those systems were not designed upfront with that trade-off in mind.
When I first heard of Riak, I called Justin to learn more. In the course of our conversation, I shared my ideas about a new indexing system, and I asked if such a thing could be built on top of Riak. We had a fruitful conversation and came up with a lot of interesting ideas.
Several weeks later, Justin told me that John Muellerleile was leading a team at Basho that had started experimentally building such a system, and it was showing some promise. What followed was a lot of back and forth on requirements, ideas, and demos over a period of a few months. Each step of the way, we were all growing excited about the possibilities.
The end result of this effort is Riak Search, a new Basho product that should mark the next era of indexing system design. Collecta has gotten a solution to several large pain points, not the least of which is that Riak Search will save us a lot of money and time. Basho has gotten a new product and a new customer.
It gets even better! They even plan to open source a lot of it, a move which we greatly encourage.
I think this technology will work its way in everywhere, and users will get even better search experiences. Developers and administrators will be pleased to stop fighting against the grain of the current solutions and embrace the right tool for the job.
Jack Moffitt: Fastest XMPP Sessions with HTTP Pre-Binding
XMPP session establishment is normally quite fast over native sockets, but over BOSH, the round-trip latency of several request and response pairs can be quite high. Among its other benefits, session attachment provides a great solution to this problem.
I first heard of this technique for rapid session bootstrapping, called pre-binding, from Andy Skelton. Andy had written a module for im.wordpress.com that in a single HTTP request created and provided the credentials Strophe needs to attach to an existing session. This session was already authenticated, meaning that the first request Strophe sends can be a real, application-level request.
For example, a normal BOSH session must connect and authenticate to an XMPP server. This typically takes about 4 to 5 stanzas, each of which must wait for a response before proceeding. Assuming a 100 millisecond round-trip time, this is about half a second of latency. While not much, this is directly perceptible to users. A pre-bound request, however, involves only one round trip, turning half a second into a mere 100 milliseconds.
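The arithmetic is simple enough to write down. In this sketch, sessionLatencyMs is a hypothetical helper, and it assumes each request must wait for the previous response and that the round-trip time is constant:

```javascript
// Latency of a strictly sequential exchange: every request waits for the
// previous response, so the round trips simply add up.
function sessionLatencyMs(roundTrips, roundTripMs) {
  return roundTrips * roundTripMs;
}

const normal = sessionLatencyMs(5, 100);   // 500 ms for a normal BOSH session
const prebound = sessionLatencyMs(1, 100); // 100 ms for a pre-bound session
const saved = normal - prebound;           // 400 ms saved at start-up
```

On slower links the gap only grows, since the saving is four round trips whatever the round-trip time is.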
Andy has kindly made his http_prebind module available on his GitHub account, if you'd like to try it out.
Collecta uses this same technique, but with anonymous sessions. Any BOSH client can connect to http://collecta.com/http-pre-bind and instantly receive credentials for an anonymous session to guest.collecta.com. This makes our own client's start-up time near instantaneous. In fact, we've gone even further by having the web application server make this request before the HTML is returned and embed the credentials into the page; an anonymous session is ready and waiting as soon as the JavaScript code starts executing.
Today, we've made our own Mod-Http-Pre-Bind code available, and I've written a small example for using this with Strophe.js. I'll go over the salient parts of this example in the rest of this post.
Instead of connecting normally, an AJAX request is made to the pre-bind service. I've used jQuery for the code below, but any AJAX library will work just as well:
    // attempt prebind
    $.ajax({
        type: 'POST',
        url: PREBIND_SERVICE,
        contentType: 'text/xml',
        processData: false,
        data: $build('body', {
            to: Strophe.getDomainFromJid($('#jid').val()),
            rid: '' + Math.floor(Math.random() * 4294967295),
            wait: '60',
            hold: '1'}).toString(),
        dataType: 'xml',
        error: normal_connect,
        success: attach});

The code sends an HTTP POST request containing a BOSH-like <body/> element with the initial RID, the domain to connect to, and the hold and wait values.
The pre-bind service will return a <body/> tag in its response like the one shown here:
    <body xmlns='http://jabber.org/protocol/httpbind'
          sid='892efca20cea238958f0603f89a6f8472ef790fe'
          rid='2219367495'>
      <iq xmlns='jabber:client' id='_bind_auth_2' type='result'>
        <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
          <jid>22720631691260831658412599@guest.collecta.com/37436661951260831658614586</jid>
        </bind>
      </iq>
    </body>

You can simply extract the JID, SID, and RID values straight out of this response, and use them to call attach():
    function attach(data) {
        log('Prebind succeeded. Attaching...');
        connection = new Strophe.Connection(BOSH_SERVICE);
        connection.rawInput = rawInput;
        connection.rawOutput = rawOutput;
        var $body = $(data.documentElement);
        connection.attach($body.find('jid').text(),
                          $body.attr('sid'),
                          parseInt($body.attr('rid'), 10) + 1,
                          onConnect);
    }

Now you have an established, anonymous session to the server, ready for immediate use.
It just doesn't get much easier or faster than that!
There's a whole chapter of my book on the topic of session attachment and its various uses. It's starting to become an important technique for XMPP web applications.
Thiago Rocha Camargo: Why Skype UI OpenSource?
When legacy proprietary companies all of a sudden start OpenSourcing their software, a lot of questions are raised. Of course they have their own explanations, although these are frequently not the whole truth.
Since Skype announced their OpenSource UI, I have received some emails asking why that would happen and what the reasons behind the curtain are.
Here is a personal explanation based on some facts:
Background Story:
- Skype started their business in a market almost empty of competition. VoIP did exist, but was NOT directed at end users, nor at the mass market.
- OpenSource would not have made sense, as it usually (for big companies) takes way more time and money to build solutions on top of standards than to create your own black box with the solutions to your own problems and challenges inside. And the lack of demand for a standard format was about 99%.
- Another reason is the "secret" factor: as the first ones, they had to create all the buzz around their holy grail and of course ensure their domination.
- That is why (among other reasons) Skype built their business on top of proprietary routing algorithms and proprietary protocols and encryption.
- On the other hand, for very small VoIP services the picture is different, as they started on top of OpenSource platforms, even at a time when most of them were experimental. The reason is that their target market was massively smaller than Skype's. And of course the money to invest was way smaller as well.
- SIP was adopted by almost 99% of the rising small providers that started popping up everywhere in the world.
- An interesting fact is that most pioneering small VoIP providers started with a virtual PBX system, not even SIP routers and big platforms. Basically, a "grow on demand" strategy was adopted as well.
- VoIP is way more widespread in the world since Skype started the massification. We have a large number of companies building equipment, client soft-phones, VoIP platforms, etc.
- We have a ridiculously huge number of VoIP providers with another ridiculous range of different prices and quality.
- SIP is the main driver for the major market. It is used on landlines in Europe and the US, and it is massively used in the PBX and telephony systems of big companies, which are the main source of PAID VoIP services.
- Skype is smaller in amount of users and revenue, if compared to all SIP Providers together.
- Google acquired Gizmo5, a medium-sized VoIP provider fully operated over SIP. Yes, this is directly related.
- Skype ubiquity is restricted by their Desktop client or their very limited iPhone client.
- Skype is still growing its number of Desktop users. But the number of paying users is flat and tends to get lower.
- Skype interoperability is null, as it is based on a proprietary system and specifications.
- The competition makes more money, and the migration of old Skype users to new VoIP alternatives is big, as educated users now know how to change and why to change to cheaper and more flexible services.
- We have more SIP enabled equipment and computers than Desktops with Skype client installed.
- OpenSource their SILK codec as already mentioned in this blog.
- Open SIP Gateway for Business Users. In order to start competing in a very profitable market slice.
- Bought the patent of their VoIP routing solutions. In order to prepare for an interoperability round.
- Aim for alternative and relatively "virgin" markets like mobile, Linux, netbooks etc:
- Release more usable mobile clients.
- OpenSource as much as possible for Linux, as the current client is really crap and Linux users in general don't use closed source applications.
Skype wants its ubiquity and revenue back, and for that they are investing huge resources in Openness and Standardization, as that is what their market needs and demands nowadays.
In a mash-able, hyper-connected world, whoever is closed and not interoperable will fade.


