The names and language data are randomly generated. Languages are drawn from a pool and a random number of projects is chosen, each with a power bias (^4) towards one side of the pool.
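
As a rough illustration of this kind of bias, here is a minimal Python sketch. It assumes the "^4" means a uniform random value is raised to the fourth power before being scaled to an index in the pool. The pool contents, the function names (`biased_index`, `generate_projects`) and the `max_projects` limit are hypothetical and are not taken from the original generator.

```python
import random

# Hypothetical pool: the original description does not list the languages.
LANGUAGE_POOL = ["JavaScript", "Python", "Java", "C++", "Go", "Rust", "Haskell", "OCaml"]


def biased_index(pool_size, rng, power=4):
    """Return an index in [0, pool_size), biased towards 0.

    Raising a uniform draw in [0, 1) to a power > 1 pushes most values
    towards 0, so low indices are picked far more often than high ones.
    """
    u = rng.random()  # uniform in [0, 1)
    return int((u ** power) * pool_size)


def generate_projects(rng, max_projects=10):
    """Pick a random number of projects, each with a biased language choice.

    max_projects is an assumed upper bound, chosen only for illustration.
    """
    count = rng.randint(1, max_projects)
    return [LANGUAGE_POOL[biased_index(len(LANGUAGE_POOL), rng)] for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random()
    print(generate_projects(rng))
```

With a power of 4, a little under 60% of draws land on the first entry of an eight-item pool, so the generated data is dominated by a few languages while the rest form a long tail.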