UMBC ebiquity
How YouTube scales MySQL for its large databases

Tim Finin, 10:04am 28 December 2007

Like most research labs, we rely on MySQL whenever we need a database. And like most (I’m guessing here), we often overhear something like this in our lab: “We really need to replace MySQL with Oracle or DB2 in X so it can handle the load.” But we never get around to it.

Maybe we don’t have to. Check out Scaling MySQL at YouTube, a keynote talk by YouTube DBA Paul Tuckfield at the 2007 MySQL Conference, put online by Conversationnetwork.org.

“In mid 2006, YouTube served approximately 100 million videos in a single day. To maintain a website of that scale, one would imagine YouTube has hundreds of DBAs. But in fact, there are just three people that make it all work. Paul Tuckfield, the MySQL DBA at YouTube shares horror stories about scalability at YouTube and how he coped with them to keep the show going everyday, while learning important lessons along the way. … According to him, the three important reasons for YouTube’s scalability are Python, Memcache and MySQL replication, the last having the most impact. Most people think that the answer to scalability is in upgrading hardware and CPU power. Adding CPUs doesn’t work on its own; wisdom is in getting the maximum amount of RAM for the CPU and then fine tuning.” (src)
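To make the Memcache-plus-replication idea concrete, here is a minimal sketch in Python. It is not YouTube’s code, and the summary above gives no implementation details; it only illustrates the general pattern of sending writes to one primary, spreading reads across replicas, and putting a cache in front of the reads. Plain dictionaries stand in for MySQL and Memcache, and every name in it (ReplicatedStore, write, read, replica_reads) is hypothetical.

```python
import itertools


class ReplicatedStore:
    """Toy model of read scaling: one primary, several replicas, one cache.

    Plain dicts stand in for the MySQL servers and for Memcache; this is
    an illustration of the pattern, not a real database client.
    """

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.cache = {}
        # Round-robin over replica indexes to spread the read load.
        self._next_replica = itertools.cycle(range(replica_count))
        self.replica_reads = 0  # counts reads that reached a replica

    def write(self, key, value):
        """All writes go to the primary and are then copied to replicas."""
        self.primary[key] = value
        # Assumption: copied synchronously for simplicity. Real MySQL
        # replication is asynchronous, so replicas can lag the primary.
        for replica in self.replicas:
            replica[key] = value
        # Drop any cached copy so later reads do not see the old value.
        self.cache.pop(key, None)

    def read(self, key):
        """Serve from the cache when possible, otherwise from a replica."""
        if key in self.cache:
            return self.cache[key]
        replica = self.replicas[next(self._next_replica)]
        self.replica_reads += 1
        value = replica.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = ReplicatedStore(replica_count=2)
store.write("video:1", "Cat video")
store.read("video:1")  # cache miss, answered by a replica
store.read("video:1")  # cache hit, no database touched
print(store.replica_reads)
```

The point of the sketch is that repeated reads never reach the primary, and most never reach a replica either, which is why adding replicas and cache memory can help a read-heavy site more than adding CPUs.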


5 Responses to “How YouTube scales MySQL for its large databases”

  1. Bruce Says:

    Hi,

    Have a look at http://highscalability.com/ for more examples of LAMP, Java, etc. architectures and why you don’t need to go down the DB2, Oracle, etc. route to get high-load sites off the ground. If Flickr, etc. don’t use them, why should we?

  2. James Says:

    Because you know, the choice is always between the free, feature-poor, fast at ‘select * from table’ dbms and the expensive, scalable, feature-rich dbms. There couldn’t possibly be a free, feature-rich, scalable dbms out there that you could use. Of course not.

  3. Aaron Trevena Says:

    @James,

    It’s a case of picking the right tool for the job. I’ve worked on two high-availability sites for different clients in different markets this year: Aviation Briefings for airlines, etc., and Online Classifieds. One required Postgres, one required MySQL.

    Horses for courses, quite simply. For this kind of task MySQL beats Postgres in terms of ease of scaling, query caching, etc. When you’re dealing with very high traffic, you’re better off breaking up and simplifying your schema to get the most speed than using the “more powerful” RDBMS.

  4. Floyd Price Says:

    It’s good to see a website with this much traffic sticking with MySQL :-)

  5. the N log N » Blog Archive » Choosing a Database Course Says:

    […] form or another. This blog is backed by a MySQL database. Much larger systems like eBay(Oracle), YouTube(MySQL), and Skype(PostgreSQL) each uses a different database for its backend. I believe there is a good […]