Requirements for a search engine - hardware + software + money?

So, what do you think one needs in order to build and support a personal search engine?

Here are some of the things I think one needs. What else is needed? Please comment.

  1. Website/Domain - with all the SOPA and PIPA stuff and new laws coming
    in, I wonder if ".com", ".org" or ".net" is good enough for a search
    engine that's going to crawl websites and display some innovative
    stuff.
  2. HTML5 - for the frontend - or whatever. Please let everyone know if
    you can think of a better alternative.
  3. Python - for the backend - and probably other languages for
    server-side programming and other stuff.
  4. Hardware - do you think it's good enough to host the search engine
    with a cloud hosting company like Amazon, or is it better to have
    one's own cluster of machines and take care of it yourself?
    • NOTE: as the crawled data grows, the disk space needed will grow rapidly.
    • What's easier to take care of - a cloud or your own network?
    • What's cheaper - a cloud or your own network (cluster)?

  5. What else?
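
To put rough numbers on the disk-space note in item 4, here's a back-of-envelope sketch in Python. Everything in it is an assumption: the function name, the 50 KB average page size, the 50% index overhead and the 3x replication are made-up placeholder values, not measurements, so plug in your own. Under this simple model the disk needed grows in proportion to the number of pages crawled.

```python
def estimate_storage_gb(pages, avg_page_kb, index_overhead=0.5, replicas=1):
    """Rough disk estimate in GB for storing a crawl plus its index.

    All inputs are hypothetical planning numbers:
      pages          - number of pages crawled
      avg_page_kb    - assumed average stored page size in KB
      index_overhead - assumed index size as a fraction of the raw data
      replicas       - number of copies kept for availability
    """
    raw_kb = pages * avg_page_kb
    total_kb = raw_kb * (1 + index_overhead) * replicas
    return total_kb / (1024 * 1024)  # KB -> GB


if __name__ == "__main__":
    # Placeholder crawl sizes, only to show how the estimate scales.
    for pages in (1_000_000, 10_000_000, 100_000_000):
        gb = estimate_storage_gb(pages, avg_page_kb=50,
                                 index_overhead=0.5, replicas=3)
        print(f"{pages:>11,} pages -> about {gb:,.0f} GB")
```

Multiplying the result by a per-GB price from a cloud provider, or by the cost of your own disks, gives a first number for the cloud vs. cluster question.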


Some Questions:

  1. What kind of filesystem would you use to store such huge amounts of
    data? Obviously it has to be some kind of cluster filesystem, so that
    you can support data availability.
  2. Which database would you use to store such huge amounts of data?
  3. Would you use open source software or buy software outright? What are the
    implications of choosing one or the other?
  4. How much money would be required to do all of the stuff mentioned
    above? Just a rough idea, so that everyone knows what to plan for money-wise.
  5. How many developers would be required? What's the minimum number of
    developers needed to get a sample search engine ready for a DEMO?
    • Let's say you have 1 year to get the DEMO ready.
    • Do you think 1 person can do it all?

More questions will be added later

Cheers :-))

asked 10 Apr '12, 16:11

Dhiraj

Any particular reason you want to gather requirements for a search engine?

(10 Apr '12, 17:06) Charles Lin

@charlzz - well, the course name says "Building a search engine", and now that the course is over I'm wondering if it is really this simple. Personally, I believe that a single person should be able to create a search engine with lots of innovations.

But then, when you look at the market, you see that there are very few search engines. Obviously one factor is the cost of maintenance - otherwise there would be hundreds of search engines out there.

I am pretty pumped up to try something on my own. So before I jump into it, I would like to know what one needs to build a search engine and, beyond that, how one can sustain it over a long period considering the costs and hardware issues one could face.

I just want to know whether it's worth trying to build a search engine with new, innovative ideas.

cheers :-))

(11 Apr '12, 03:07) Dhiraj

I think, once you start, you'll see it's a big challenge. I think the first thing you really need to think about is what you want the search engine to excel at. It's easy to say "I want to be innovative", but if you have no particular ideas on how to be innovative, then you're stuck right at the start.

(11 Apr '12, 07:27) Charles Lin

Hi charlzz - yes, I do know it's a challenge. By "innovative" I mean implementing ideas that are not yet in production or on the market, and I do have a few such ideas.

But before I start, I would like to know whether it's worth implementing a search engine that caters to a specific domain or type. I don't see the software implementation as an issue - it's the hardware costs and maintenance. The rest can be managed. If money becomes a bottleneck, then it's kind of useless, and I would rather build something on top of an existing search engine than build a search engine from scratch.

It's like saying I want to build a space station without understanding the investment side of it. Even if one has great ideas for building a space station, it takes a huge amount of money to build and maintain it.

And that's what I am trying to get at - what's the overall cost of building a search engine? The software part is OK - it's the hardware and its maintenance.

Anyway, I think this may be the wrong place to post such a question.

Thanks n Cheers :-))

(11 Apr '12, 08:14) Dhiraj

One Answer:

A site with a .com domain falls under US law even if the owner is not US-based and the site isn't hosted in the US: http://www.zdnet.com.au/com-beholden-to-maryland-law-339332813.htm I wouldn't recommend a .com if you're not in the US.

One person can definitely do it, as proven by Gabriel Weinberg, the founder of DuckDuckGo. If you are interested in the code, some useful links can be found here: http://news.ycombinator.com/item?id=3771300

answered 10 Apr '12, 16:25

Chris Thompson

Thanks for the HN link. Will definitely go through the docs/links mentioned there. Will also check out the DuckDuckGo docs.

What are the options for a non-US owner? I don't understand the legal side in depth; I just have a rough idea. Will these laws apply to other countries sometime in the future, if countries team up?

Thanks once again :-))

(10 Apr '12, 16:34) Dhiraj

You'll have to check which laws, if any, apply to each specific TLD. I only know about .com because it was briefly mentioned in one of these lectures, I think: http://cs75.tv/2010/fall/ And there was the recent Megaupload thing. That's about the extent of my knowledge of the laws around TLDs. Sorry.

(10 Apr '12, 16:42) Chris Thompson
Asked: 10 Apr '12, 16:11

Seen: 481 times

Last updated: 11 Apr '12, 08:14