Wastholm.com

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, largely written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing. In addition, the Julia developer community is contributing a number of external packages through Julia’s built-in package manager at a rapid pace. Julia programs are organized around multiple dispatch: functions are defined and overloaded for different combinations of argument types, which can also be user-defined. For a more in-depth discussion of the rationale and advantages of Julia over other systems, see the following highlights or read the introduction in the online manual.
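To make the idea of multiple dispatch concrete, here is a minimal sketch in Python rather than Julia. It is an analogy only: the `dispatch` decorator, the registry, and the `Circle`/`Square` classes are hypothetical names invented for this illustration, and the lookup matches exact argument types (no subtype resolution, which a real implementation such as Julia's does handle).

```python
# A toy multiple-dispatch registry. All names here are illustrative assumptions.
_registry = {}


def dispatch(*types):
    """Register an implementation for one combination of argument types."""
    def register(func):
        _registry[(func.__name__, types)] = func

        def dispatcher(*args):
            # Select the implementation from the runtime types of ALL arguments.
            arg_types = tuple(type(a) for a in args)
            impl = _registry.get((func.__name__, arg_types))
            if impl is None:
                raise TypeError(f"no method {func.__name__} for {arg_types}")
            return impl(*args)

        return dispatcher
    return register


class Circle:
    pass


class Square:
    pass


@dispatch(Circle, Square)
def collide(a, b):
    return "circle hits square"


@dispatch(Square, Circle)
def collide(a, b):
    return "square hits circle"


@dispatch(Circle, Circle)
def collide(a, b):
    return "circle hits circle"
```

Calling `collide(Circle(), Square())` and `collide(Square(), Circle())` reaches different implementations, because the choice depends on the types of both arguments and not only the first.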

Tahoe-LAFS is a Free and Open cloud storage system. It distributes your data across multiple servers. Even if some of the servers fail or are taken over by an attacker, the entire filesystem continues to function correctly, including preservation of your privacy and security.

A one-page summary explains the unique properties of this system.

ownCloud gives you universal access to your files through a web interface or WebDAV. It also provides a platform to easily view & sync your contacts, calendars and bookmarks across all your devices and enables basic editing right on the web.

Cloud Foundry, a VMware-led project, is the world’s first open Platform as a Service (PaaS) offering. Cloud Foundry provides a platform for building, deploying, and running cloud apps using Spring for Java developers, Rails and Sinatra for Ruby developers, Node.js, and other JVM frameworks including Grails.

At BackType, we are heavy users of Hadoop. We use it to run computations on our 30TB datastore of social data. We've even open-sourced some significant projects that are built on top of Hadoop.

Unfortunately, Hadoop has problems. It's sloppily implemented and requires all sorts of arcane knowledge to operate. We would be the first to try out a replacement for Hadoop if a viable alternative existed. In this post, we'll look at some of the darker aspects of Hadoop.

BrowserCouch is an attempt at an in-browser MapReduce implementation. It's written entirely in JavaScript and intended to work on all browsers, gracefully upgrading when support for better efficiency or feature set is detected.

Not coincidentally, this library is intended to mimic the functionality of CouchDB on the client-side, and may even support integration with CouchDB in the future.

...

To learn how to use BrowserCouch, check out the work-in-progress tutorial.
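For readers unfamiliar with the MapReduce model mentioned above, the following is a rough single-process sketch in Python. It does not use or reproduce BrowserCouch's JavaScript API; `map_reduce`, `word_map`, and `count_reduce` are hypothetical names chosen for this example, and the map function yields key/value pairs where CouchDB-style views would emit them.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Run map_fn over every document, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_map(doc):
    # Map step: one (word, 1) pair per word in the document text.
    for word in doc["text"].lower().split():
        yield word, 1


def count_reduce(key, values):
    # Reduce step: total the counts collected for one word.
    return sum(values)


docs = [{"text": "hello world"}, {"text": "hello again"}]
counts = map_reduce(docs, word_map, count_reduce)
```

Here `counts` maps each word to the number of times it occurs across all documents.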

Disqus, one of the largest Django applications in the world, will explain how they deal with scaling complexities in a small startup.

There are many benefits to keeping a lightweight stack. At Disqus, keeping the stack thin helps us scale Django to reach over 125 million unique visitors a month with just a small team of engineers. Avoiding complicated software packages until needed reduces unnecessary overhead, and has let us stay nimble and use new capabilities in Django (e.g., database routing) and other software as they arise. The talk will cover key parts of the architecture and development process at Disqus, including databases (relational and non), queues, automated testing, and continuous deployment.

The latest version of the Tribler BitTorrent client (Win, Mac and Linux), released only a few minutes ago, is capable of all the above and many more things that could be described as quite revolutionary. The client combines a ‘zero-server’ approach with features such as instant video streaming, advanced spam control and personalized content channels, all bundled into a single application.

...

Despite the fact that only a few thousand people are using Tribler on a monthly basis, in technological terms it is one of the most advanced clients. People who install the client will notice that there’s a search box at the top of the application, similar to that offered by other clients. However, when one does a search the results don’t come from a central index. Instead, they come from other peers.

GNU parallel is a shell tool for executing jobs in parallel locally or using remote machines. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. If you use xargs today you will find GNU parallel very easy to use as GNU parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. If you use ppss or pexec you will find GNU parallel will often make the command easier to read.
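The "one job per input line, several at a time" pattern described above can be sketched in Python as follows. This is not how GNU parallel is implemented; `run_job` and `run_parallel` are hypothetical helpers, and the job itself is a stand-in command (a child Python process that upper-cases its argument) chosen only so the example runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(arg):
    """Run one external command for one input line and return its output."""
    # Stand-in command: a child Python process that upper-cases its argument.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", arg]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def run_parallel(lines, max_workers=4):
    """Run run_job for every input line, up to max_workers jobs at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, even if jobs finish out of order.
        return list(pool.map(run_job, lines))


results = run_parallel(["a.txt", "b.txt", "c.txt"])
```

Threads are sufficient here because each job spends its time waiting on a child process, so the work itself happens outside the Python interpreter.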

Redis is an advanced key-value store. It is similar to memcached, but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All these data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server-side union, intersection, and difference between sets, and so forth. Redis supports different kinds of sorting.

In order to be very fast but at the same time persistent, the whole dataset is kept in memory and from time to time saved to disk asynchronously (semi-persistent mode), or alternatively every change is written to an append-only file (fully persistent mode). Redis is able to rebuild the append-only file in the background when it gets too big.
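The append-only-file idea can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not Redis's actual file format or code: the `AppendOnlyStore` class and its JSON-lines log are invented for this example, every write is synchronous, and the rewrite step runs in the foreground rather than in the background.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """In-memory dict whose every change is also appended to a log file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            # Recover state by replaying the log from the beginning.
            with open(path, encoding="utf-8") as log:
                for line in log:
                    self._apply(json.loads(line))

    def _apply(self, entry):
        if entry["op"] == "set":
            self.data[entry["key"]] = entry["value"]
        else:
            self.data.pop(entry["key"], None)

    def _append(self, entry):
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")
        self._apply(entry)

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})

    def delete(self, key):
        self._append({"op": "del", "key": key})

    def rewrite(self):
        """Replace the log with the minimal set of entries for the current state."""
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as log:
            for key, value in self.data.items():
                entry = {"op": "set", "key": key, "value": value}
                log.write(json.dumps(entry) + "\n")
        os.replace(tmp, self.path)  # swap the compacted log into place


path = os.path.join(tempfile.mkdtemp(), "store.aof")
store = AppendOnlyStore(path)
store.set("a", 1)
store.set("a", 2)
store.set("b", 3)
store.delete("b")
recovered = AppendOnlyStore(path).data  # state rebuilt purely from the log
store.rewrite()  # the log shrinks from four entries to one
```

The log grows with every change, including overwrites and deletes, which is why a periodic rewrite that keeps only the entries needed to reproduce the current state is useful.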
