thinking about social networks

AlfaOmega08
Member
Posts: 226
Joined: Wed Nov 07, 2007 12:15 pm
Location: Italy

thinking about social networks

Post by AlfaOmega08 »

Well, I love web development and OS development equally, so I switch from one to the other every three months. This time I'm taking a look at social networking sites (Facebook, MySpace, Twitter, etc.) and how they are built.
I looked at Twitter, which is the simplest of the three and the least used. The idea is really simple: it lets you answer the question "What are you doing?" in 140 characters and share the answer with your followers.
I really don't think building such a site would be more complicated than developing an OS, at least until you reach a large number of users.
Twitter in fact has a search page that shows the latest messages containing your keywords, even if a message was posted 3 seconds earlier (so no spider crawling the messages, but live updating). To provide such a service, the search engine would have to look through the whole (enormous) db to find the matching records. That should take many seconds, not just half a second.
Another problem comes with the complex structure of Facebook, which maintains lots of information about you, your friends, your groups, your updates, and so on. It is probably easier to create many tables per user in the db than a single row per user containing all their data.
But in MySQL each table takes three files, and in ext3 you can only have 32,000 files per directory (64,000 in ext4), so you would reach the limit very quickly. So I thought that MySQL is not the way to go. I tried an XML file for each user with all their information. But once again, with this method you cannot search the latest messages (like Twitter does). In fact, it would take many months to open and parse millions of XML files.
So how do these services provide high speed with no problems?
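One common answer to the search part of this question is an inverted index: instead of scanning every stored message for a keyword, each word is mapped to the ids of the messages that contain it, and the map is updated the moment a message is posted. The sketch below is a toy, in-memory illustration of that idea and makes no claim about how Twitter actually implements search; `MessageIndex`, `add`, and `search` are hypothetical names chosen for this example.

```python
from collections import defaultdict


class MessageIndex:
    """Toy in-memory inverted index: word -> ids of messages containing it."""

    def __init__(self):
        self.messages = []                 # a message's id is its list position
        self.postings = defaultdict(list)  # word -> message ids, oldest first

    def add(self, text):
        """Store a message and index its words immediately."""
        msg_id = len(self.messages)
        self.messages.append(text)
        for word in set(text.lower().split()):
            self.postings[word].append(msg_id)
        return msg_id

    def search(self, keyword, limit=10):
        """Return up to `limit` messages containing keyword, newest first."""
        ids = self.postings.get(keyword.lower(), [])
        return [self.messages[i] for i in reversed(ids[-limit:])]


index = MessageIndex()
index.add("Writing my own OS kernel")
index.add("Eating pizza")
index.add("Debugging the kernel again")
print(index.search("kernel"))
# ['Debugging the kernel again', 'Writing my own OS kernel']
```

A lookup touches only the list for one word, so a message posted seconds ago is found without rescanning anything. A real system would also need persistence, tokenization beyond splitting on whitespace, and a way to spread the index over several machines.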
Please, correct my English...
Motherboard: ASUS Rampage II Extreme
CPU: Core i7 950 @ 3.06 GHz OC at 3.6 GHz
RAM: 4 GB 1600 MHz DDR3
Video: nVidia GeForce 210 GTS... it sucks...
whowhatwhere
Member
Posts: 199
Joined: Sat Jun 28, 2008 6:44 pm

Re: thinking about social networks

Post by whowhatwhere »

AlfaOmega08 wrote: Well, I love web development and OS development equally, so I switch from one to the other every three months. This time I'm taking a look at social networking sites (Facebook, MySpace, Twitter, etc.) and how they are built.
I looked at Twitter, which is the simplest of the three and the least used. The idea is really simple: it lets you answer the question "What are you doing?" in 140 characters and share the answer with your followers.
I really don't think building such a site would be more complicated than developing an OS, at least until you reach a large number of users.
Twitter in fact has a search page that shows the latest messages containing your keywords, even if a message was posted 3 seconds earlier (so no spider crawling the messages, but live updating). To provide such a service, the search engine would have to look through the whole (enormous) db to find the matching records. That should take many seconds, not just half a second.
Another problem comes with the complex structure of Facebook, which maintains lots of information about you, your friends, your groups, your updates, and so on. It is probably easier to create many tables per user in the db than a single row per user containing all their data.
But in MySQL each table takes three files, and in ext3 you can only have 32,000 files per directory (64,000 in ext4), so you would reach the limit very quickly. So I thought that MySQL is not the way to go. I tried an XML file for each user with all their information. But once again, with this method you cannot search the latest messages (like Twitter does). In fact, it would take many months to open and parse millions of XML files.
So how do these services provide high speed with no problems?
pcmattman
Member
Posts: 2566
Joined: Sun Jan 14, 2007 9:15 pm
Libera.chat IRC: miselin
Location: Sydney, Australia (I come from a land down under!)

Re: thinking about social networks

Post by pcmattman »

Most likely not merely one partition for the databases either ;)
AlfaOmega08
Member
Posts: 226
Joined: Wed Nov 07, 2007 12:15 pm
Location: Italy

Re: thinking about social networks

Post by AlfaOmega08 »

I just tried filling a 5-column table with 3,600,000 rows of random text. It took about 15 minutes (my old Pentium...). A query using the LIKE keyword took 4 to 8 seconds. It seems the solution is called memcached. I'll give it a try.
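The usual way a cache such as memcached is put in front of a slow query is the cache-aside pattern: look in the cache first, and only run the query on a miss. The sketch below illustrates the pattern only. It assumes a plain dictionary with expiry times as a stand-in for a real cache server, and `TinyCache`, `slow_search`, and `cached_search` are hypothetical names, not part of memcached or any client library.

```python
import time


class TinyCache:
    """Dict-based stand-in for a cache server; entries expire after ttl seconds."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self.store = {}  # key -> (expiry time, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None or entry[0] < time.monotonic():
            return None  # missing or expired
        return entry[1]

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)


def slow_search(rows, keyword):
    """Stands in for the LIKE query: scans every row."""
    return [row for row in rows if keyword in row]


def cached_search(cache, rows, keyword):
    """Cache-aside: try the cache first, fall back to the slow scan."""
    result = cache.get(keyword)
    if result is None:
        result = slow_search(rows, keyword)
        cache.set(keyword, result)
    return result


rows = ["random text one", "more random text", "something else"]
cache = TinyCache(ttl=30.0)
print(cached_search(cache, rows, "random"))  # first call runs the scan
print(cached_search(cache, rows, "random"))  # second call is served from the cache
```

The trade-off is freshness: until an entry expires, a cached result does not include rows added after it was stored, so a cache alone speeds up repeated queries but does not give live search.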
pcmattman
Member
Posts: 2566
Joined: Sun Jan 14, 2007 9:15 pm
Libera.chat IRC: miselin
Location: Sydney, Australia (I come from a land down under!)

Re: thinking about social networks

Post by pcmattman »

Keep in mind these guys would be working with dedicated hardware, not a desktop PC, for their databases (and files, and web services, probably all behind load balancers). For all we know, they may not even use MySQL.
JackScott
Member
Posts: 1031
Joined: Thu Dec 21, 2006 3:03 am
Location: Hobart, Australia

Re: thinking about social networks

Post by JackScott »

Also keep in mind that large Web 2.0 apps use completely different programming frameworks from normal applications. For instance, they might use a divide-and-conquer approach: split the database into 1000 pieces, search each piece, then collect the results.
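The split, search, and collect steps can be sketched in a few lines. This is only an illustration of the structure, assuming the "pieces" are in-memory lists and the workers are threads on one machine; in a real deployment each piece would live on its own server. `split_into_shards`, `search_shard`, and `search_all` are hypothetical names for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(rows, shard_count):
    """Deal rows round-robin into shard_count lists."""
    return [rows[i::shard_count] for i in range(shard_count)]


def search_shard(shard, keyword):
    """Scan one shard; on a real cluster this would run on its own machine."""
    return [row for row in shard if keyword in row]


def search_all(shards, keyword):
    """Search every shard in parallel, then collect the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, keyword), shards)
        return [row for partial in partials for row in partial]


rows = ["a cat", "a dog", "cat food", "bird", "catalog"]
shards = split_into_shards(rows, 3)
print(search_all(shards, "cat"))
```

Results come back grouped by shard rather than in the original row order, so a real search would add a merge step (for example, sorting by timestamp) after collecting them.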

I also happen to know that the biggest apps (Google Maps, Facebook, and Twitter is moving this way) use functional-style programming a lot. Because their code has no side effects, it is massively scalable. Source: this talk I went to.
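To make the side-effect point concrete, here is a small sketch in functional style, assuming the task is counting words across messages. Each function depends only on its arguments and modifies nothing, so the per-message step could be run in any order or on different machines and the results merged afterwards. `count_words`, `merge`, and `word_totals` are hypothetical names for this example, not taken from the talk.

```python
from collections import Counter
from functools import reduce


def count_words(message):
    """Pure: the result depends only on the argument, nothing is modified."""
    return Counter(message.lower().split())


def merge(left, right):
    """Pure: returns a new Counter and leaves both inputs untouched."""
    return left + right


def word_totals(messages):
    """Map each message to its own counts, then fold the counts together."""
    return reduce(merge, map(count_words, messages), Counter())


totals = word_totals(["the cat", "The dog", "cat nap"])
print(totals["the"], totals["cat"], totals["nap"])
```

Because `merge` is associative, the fold could be split into independent partial folds and combined at the end, which is the property that makes this style easy to distribute.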

Hope that all makes sense.

As an aside, if you do bother listening to the talk (mildly interesting), there's a guy at the end who points out the correct date that Lisp was invented. That's me, and it turns out I was wrong, it was 1958 (for the paper).
AlfaOmega08
Member
Posts: 226
Joined: Wed Nov 07, 2007 12:15 pm
Location: Italy

Re: thinking about social networks

Post by AlfaOmega08 »

OMG, I've never heard of functional programming before... but it seems very complex.
I really don't see any way to rewrite my OS in functional programming C++.
You can only use const variables, right?
It would surely increase scalability, but isn't it slower in many situations?
whowhatwhere
Member
Posts: 199
Joined: Sat Jun 28, 2008 6:44 pm

Re: thinking about social networks

Post by whowhatwhere »

AlfaOmega08 wrote: OMG, I've never heard of functional programming before... but it seems very complex.
I really don't see any way to rewrite my OS in functional programming C++.
You can only use const variables, right?
It would surely increase scalability, but isn't it slower in many situations?
MUTUAL EXCLUSION.
Do you see it?
steveklabnik
Member
Posts: 72
Joined: Wed Jan 28, 2009 4:30 pm

Re: thinking about social networks

Post by steveklabnik »

AlfaOmega08 wrote: OMG, I've never heard of functional programming before... but it seems very complex.
Only if you try to approach it like "regular" programming. My more math-inclined friends took to Haskell immediately; it made quite a bit of sense to them.
AlfaOmega08 wrote: I really don't see any way to rewrite my OS in functional programming C++.
You can only use const variables, right?
There is no such thing as 'functional programming C++'. So no, you can't.
AlfaOmega08 wrote: It would surely increase scalability, but isn't it slower in many situations?
Yes, programs written in functional languages tend, on the whole, to be slower than procedural ones. However, Haskell in particular can be as fast as C at times. GHC is an amazing piece of work; I can't stress that enough. That's not always the case, though. It's pretty easy to write slow code.
AndrewAPrice
Member
Posts: 2298
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: thinking about social networks

Post by AndrewAPrice »

You're all twits, right?

(pun intended)
My OS is Perception.