The Good, the Bad, and the Ugly

Online Research Sites and Internet Research

    Deep Web Research Tools

    This list offers some tips and tools to help you get the most out of your Internet searches.

    Semantic Search Tools and Databases

    Semantic search tools attempt to replicate the way the human brain thinks and categorizes information in order to return more relevant results. Give some of these semantic tools and databases a try.
    • Zotero. Firefox users will like this add-on that helps you organize your research material by collecting, managing, and citing any references from Internet research.
    • Freebase. This community-powered database includes information on millions of topics.
    • Powerset. Enter a topic, phrase, or question to find information from Wikipedia with this semantic application.
    • Kartoo. Enter any keyword to receive a visual map of the topics that pertain to your keyword. Hover your mouse over each to get a thumbnail of the website.
    • DBpedia. Another Wikipedia resource; use this semantic program to ask complex questions and get results from within Wikipedia.
    • Quintura. Entering your search term will create a cloud of related terms as well as a list of links. Hover over one of the words or phrases in the cloud to get an entirely different list of links.
    • [true knowledge]. Help with the current beta test of this search engine, or try its Quiz Bot, which finds answers to your questions.
    • Stumpedia. This search engine relies on its users to index, organize, and review information coming from the Internet.
    • Evri. This search engine provides you with highly relevant results from articles, papers, blogs, images, audio, and video on the Internet.
    • Gnod. When you search for books, music, movies and people on this search engine, it remembers your interests and focuses the search results in that direction.
    • Boxxet. Search for what interests you and you will get results from the "best of" news, blogs, videos, photos, and more. Type in your keyword and in addition to the latest news on the topic, you will also receive search results, online collections, and more.

    Meta-Search Engines

    Meta-search engines use the resources of many different search engines to gather the most results possible. Many of these will also eliminate duplicates and classify results to enhance your search experience.
    • SurfWax. This search engine works very well for reaching deep into the web for information.
    • Academic Index. Created by the former chair of Texas Association of School Librarians, this meta-search engine only pulls from databases and resources that are approved by librarians and educators.
    • Infomine has been built by a pool of libraries in the United States.
    • Clusty. Clusty searches through top search engines, then clusters the results so that information that may have been hidden deep in the search results is now readily available.
    • Dogpile. Dogpile searches rely on several top search engines for the results then removes duplicates and strives to present only relevant results.
    • Turbo 10. This meta-search engine is specifically designed to search the deep web for information.
    • Multiple Search. Save yourself the work by using this search engine that looks among major search engines, social networks, flickr, Wikipedia, and many more sites.
    • Mamma. Click on the Power Search option to customize your search experience with this meta-search engine.
    • World Curry Guide. This meta-search tool with a strong European influence has been around since 1997 and is still growing strong.
    • Give this meta-search engine a try. It accesses a large number of databases and claims to have more access to information than Google.
    • Icerocket. Search blogs as well as the general Internet, MySpace, the news, and more to receive results by posting date.
    • iZito. Get results from a variety of major search engines that come to you clustered in groups. You can also receive only US website results or receive results with a more international perspective.
    • Ujiko. This unusual meta-search tool allows for you to customize your searches by eliminating results or tagging some as favorites.
    • IncyWincy is an Invisible Web search engine and it behaves as a meta-search engine by tapping into other search engines and filtering the results. It searches the web, directory, forms, and images. With a free registration, you can track search results with alerts.
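    The blend-and-dedupe step these meta-search engines perform can be sketched in a few lines. The following is an illustrative sketch only (the engine names and result lists are invented), not the actual algorithm of any engine listed above:

```python
# Hypothetical sketch of meta-search result blending.
def merge_results(result_lists):
    """Blend ranked result lists from several engines,
    removing duplicate URLs as they are encountered."""
    seen = set()
    merged = []
    # Interleave round-robin so no single engine dominates the top.
    for rank in range(max(len(r) for r in result_lists)):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged

engine_a = ["a.com", "b.com", "c.com"]  # invented results
engine_b = ["b.com", "d.com"]
print(merge_results([engine_a, engine_b]))  # ['a.com', 'b.com', 'd.com', 'c.com']
```

    Real meta-search engines layer relevance weighting, source attribution, and paid-listing handling on top of this basic blend.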

    General Search Engines and Databases

    These databases and search engines for databases will provide information from places on the Internet that most typical search engines cannot reach.
    • DeepDyve. One of the newest search engines specifically targeted at exploring the deep web, this one is available after you sign up for a free membership.
    • OAIster. Search for digital items with this tool that provides 12 million resources from over 800 repositories.
    • direct search. Search through all the direct search databases or select a specific one with this tool.
    • CloserLook Search. Search for information on health, drugs and medicine, city guides, company profiles, and Canadian airfares with this customized search engine that specializes in the deep web.
    • Northern Light Search. Find information with the quick search or browse through other search tools here.
    • Yahoo! Search Subscriptions. Use this tool to combine a search on Yahoo! with searches in journals where you have subscriptions such as Wall Street Journal and New England Journal of Medicine.
    • Librarians’ Internet Index (LII) is a publicly-funded website and weekly newsletter serving California, the nation, and the world.
    • The Scout Archives. This database is the culmination of nine years’ worth of compiling the best of the Internet.
    • Daylife. Find news with this site that offers some of the best global news stories along with photos, articles, quotes, and more.
    • Silobreaker. This tool shows how news and the people in the news impact global culture, with current news stories, corresponding maps, graphs of trends, networks of related people or topics, fact sheets, and more.
    • spock. Find anyone on the web who might not normally show up on the surface web through blogs, pictures, social networks, and websites here.
    • The WWW Virtual Library. One of the oldest databases of information available on the web, this site allows you to search by keyword or category.
    • pipl. Specifically designed for searching the deep web for people, this search engine claims to be the most powerful for finding someone.
    • Complete Planet is a free, well-designed directory resource that makes it easy to access the mass of dynamic databases cloaked from general-purpose search.
    • Infoplease is an information portal with a host of features. Using the site, you can tap into a good number of encyclopedias, almanacs, an atlas, and biographies. Infoplease also has a few nice offshoots, such as a version for kids and Biosearch, a search engine just for biographies.

    Academic Search Engines and Databases

    The world of academia has many databases not accessible by Google and Yahoo!, so give these databases and search engines a try if you need scholarly information.
    • Google Scholar. Find information among academic journals with this tool.
    • WorldCat. Use this tool to find items in libraries including books, CDs, DVDs, and articles.
    • getCITED. This database of academic journal articles and book chapters also includes a discussion forum.
    • Microsoft Libra. If you are searching for computer science academic research, then Libra will help you find what you need.
    • BASE – Bielefeld Academic Search Engine. This multi-disciplinary search engine focuses on academic research and is available in German, Polish, and Spanish as well as English.
    • yovisto. This search engine is an academic video search tool that provides lectures and more.
    • AJOL – African Journals Online. Search academic research published in AJOL with this search engine.
    • HighWire Press. From Stanford, use this tool to access thousands of peer-reviewed journals and full-text articles.
    • MetaPress. This tool claims to be the "world’s largest scholarly content host" and provides results from journals, books, reference material, and more.
    • OpenJ-Gate. Access over 4500 open journals with this tool that allows you to restrict your search to peer-reviewed journals or professional and industry journals.
    • Directory of Open Access Journals. This journal search tool provides access to over 3700 top "quality controlled" journals.
    • Intute. The resources here are all hand-selected and specifically for education and research purposes.
    • Virtual Learning Resource Center. This tool provides links to thousands of academic research sites to help students at any level find the best information for their Internet research projects.
    • Gateway to 21st Century Skills. This resource for educators is sponsored by the US Department of Education and provides information from a variety of places on the Internet.
    • MagBot. This search engine provides journal and magazine articles on topics relevant to students and their teachers.
    • Michigan eLibrary. Find full-text articles as well as specialized databases available for searching.

    Scientific Search Engines and Databases

    The scientific community keeps many databases that can provide a huge amount of information but may not show up in searches through an ordinary search engine. Check these out to see if you can find what you need to know.
    • This search engine offers specific categories including agriculture and food, biology and nature, Earth and ocean sciences, health and medicine, and more.
    • Search for science information with this connection to international science databases and portals.
    • CiteSeer.IST. This search engine and digital library will help you find information within scientific literature.
    • Scirus has a pure scientific focus. It is a far-reaching research engine that can scour journals, scientists’ homepages, courseware, pre-print server material, patents, and institutional intranets.
    • Scopus. Find academic information among science, technology, medicine, and social science categories.
    • GoPubMed. Search for biomedical texts with this search engine that accesses PubMed articles.
    • the Gene Ontology. Search the Gene Ontology database for genes, proteins, or Gene Ontology terms.
    • PubFocus. This search engine searches Medline and PubMed for information on articles, authors, and publishing trends.
    • Scitation. Find over one million scientific papers from journals, conferences, magazines, and other sources with this tool.

    Custom Search Engines

    Custom search engines narrow your focus and eliminate quite a bit of the extra information usually contained in search results. Use these resources to find custom search engines or use the specific custom search engines listed below.
    • This listing includes many of the Google custom search engines created.
    • Find custom search engines here or create your own.
    • CSE Links. Use this site to find Google Coop custom search engines.
    • PGIS PPGIS Custom Search. This search engine is customized for those interested in the "practice and science" of PGIS/PPGIS.
    • Files Tube. Search for files in file sharing and uploading sites with this search engine.
    • Rollyo. "Roll your own search engine" at this site where you determine which sites will be included in your searches.

    Collaborative Information and Databases

    One of the oldest forms of information dissemination is word-of-mouth, and the Internet is no different. With the popularity of bookmarking and other collaborative sites, obscure blogs and websites can gain plenty of attention. Follow these sites to see what others are reading.
    • As readers find interesting articles or blog posts, they can tag, save, and share them so that others can enjoy the content as well.
    • Digg. As people read blogs or websites, they can "digg" the ones they like, thus creating a network of user-selected sites on the Internet.
    • Technorati. Not only is this site a blog search engine, but it is also a place for members to vote and share, thus increasing the visibility for blogs.
    • StumbleUpon. As you read information on the Internet, you can Stumble it and give it a thumbs up or down. The more you Stumble, the more closely the content will align with your tastes.
    • Reddit. Working similarly to StumbleUpon, Reddit asks you to vote on articles, then customizes content based on your preferences.
    • Twine. With Twine you can search for information as well as share with others and get recommendations from Twine.
    • This collaborative site offers shared knowledge from its members through forums, blogs, and shared websites.

    Hints and Strategies

    Searching the deep web should be done a bit differently, so use these strategies to help you get started on your deep web searching.
    • Don’t rely on old ways of searching. Become aware that approximately 99% of content on the Internet doesn’t show up on typical search engines, so think about other ways of searching.
    • Search for databases. Using any search engine, enter your keyword alongside "database" to find any searchable databases (for example, "running database" or "woodworking database").
    • Get a library card. Many public libraries offer access to research databases for users with an active library card.
    • Stay informed. Reading blogs or other updated guides about Internet searches on a regular basis will ensure you are staying updated with the latest information on Internet searches.
    • Search government databases. There are many government databases available that have plenty of information you may be seeking.
    • Bookmark your databases. Once you find helpful databases, don’t forget to bookmark them so you can always come back to them again.
    • Practice. Just like with other types of research, the more you practice searching the deep web, the better you will become at it.
    • Don’t give up. Researchers agree that much of the information hidden in the deep web is among the highest-quality information available.

    Helpful Articles and Resources for Deep Searching

    Take advice from the experts and read these articles, blogs, and other resources that can help you understand the deep web.
    • Deep Web – Wikipedia. Get the basics about the deep web as well as links to some helpful resources with this article.
    • Deep Web – AI3:::Adaptive Information. This assortment of articles from the co-coiner of the phrase "deep web," Michael Bergman, offers a look at the current state of deep web perspectives.
    • The Invisible Web. This article provides a very simple explanation of the deep web and offers suggestions for tackling it.
    • ResourceShelf. Librarians and researchers come together to share their findings on fun, helpful, and sometimes unusual ways to gather information from the web.
    • Docuticker. This blog offers the latest publications from government agencies, NGOs, think tanks, and other similar organizations. Many of these posts are links to databases and research statistics that may not appear so easily on typical web searches.
    • This site offers tips and tools for IT professionals to find the best deep web resources.
    • Digital Image Resources on the Deep Web. This article includes links to many digital image resources that probably won’t show up on typical search engine results.
    • Timeline of events related to the Deep Web. This timeline puts the entire history of the deep web into perspective as well as offers up some helpful links.
    • The Deep Web. Learn terminology, get tips, and think about the future of the deep web with this article.
    • How to Evaluate Web Resources. This guide helps students quickly evaluate the credibility of any resource they find on the Internet.

    International Handbook of Internet Research
    Jeremy Hunsinger, Lisbeth Klastrup, Matthew M. Allen (2010)
    This handbook, the first of its kind, is a detailed introduction to the numerous academic perspectives we can apply to the study of the internet as a political, social and communicative phenomenon.

    The Internet Research Handbook: A Practical Guide for ...
    Niall Ó Dochartaigh (2002)
    Ideal as a course textbook at undergraduate and graduate level in a range of social science disciplines where doing a research project is an integral part of the course.

    Internet Communication and Qualitative Research: A ...
    Publisher description: "The internet is exploding with possibilities for conducting social research. Mann and Stewart offer the first in-depth consideration of the prospects and potentials for doing qualitative research on-line."

    Research Handbook on Governance of the Internet
    Ian Brown (2013)
    This Handbook provides a comprehensive overview of the latest research on Internet governance, written by the leading scholars in the field.

    Research Handbook on EU Internet Law
    Andrej Savin, Jan Trzaskowski (2014)
    Includes a chapter by Christopher Marsden on net neutrality law, a growing policy controversy relating to traffic management techniques used by Internet Service Providers (ISPs).

    The Oxford Handbook of Internet Studies
    William H. Dutton (2013)
    The Oxford Handbook of Internet Studies has been designed to provide a valuable resource for academics and students in this area, bringing together leading scholarly perspectives on how the Internet has been studied and how the research ...

    The Extreme Searcher's Guide to Web Search Engines: A ...
    Randolph Hock (2001)
    Describes search strategies, explains how documents are pulled into engines' databases, and explores the special characteristics of such Internet search engines as AltaVista, Excite, Infoseek, Yahoo!, HotBot, and Lycos.

Metacrawlers and Metasearch Engines

Unlike search engines, metacrawlers don't crawl the web themselves to build listings. Instead, they allow searches to be sent to several search engines all at once. The results are then blended together onto one page. Below are some of the major metacrawlers. Also see the Search Toolbars & Utilities page for metacrawler-style software that you can run from your desktop.

Award Winners

Popular metasearch site owned by InfoSpace that sends a search to a customizable list of search engines, directories and specialty search sites, then displays results from each search engine individually. Winner of Best Meta Search Engine award from Search Engine Watch for 2003. (Review: Dogpile Sports a Fetching New Look, SearchDay, Sept. 2, 2003. Updates: Dogpile Enhances Search Results Search Engine Watch Blog, Nov. 10, 2004 - Dogpile Adds New Features Search Engine Watch Blog, Jan. 18, 2005 )
Enter a search term, and Vivisimo will not only pull back matching responses from major search engines but also automatically organize the pages into categories. Slick and easy to use. Vivisimo won second place for Best Meta Search Engine in the 2003 Search Engine Watch awards and was the winner in 2002. (Review: Power Searching with Vivisimo, SearchDay, July 8, 2003)
If you like the idea of seeing your web results visually, this meta search site shows the results with sites being interconnected by keywords. Honorable mention for Best Meta Search Engine award from Search Engine Watch in 2002.
Founded in 1996, Mamma is one of the oldest meta search engines on the web. Mamma searches against a variety of major crawlers, directories and specialty search sites. The service also provides a paid listings option for advertisers, Mamma Classifieds. Mamma was an honorable mention for Best Meta Search Engine in the 2003 Search Engine Watch awards.
Searches against major engines or provides those who open free accounts the ability to choose from a list of hundreds. Using the "SiteSnaps" feature, you can preview any page in the results and see where your terms appear in the document. Allows results or documents to be saved for future use. Honorable mention for Best Meta Search Engine award from Search Engine Watch in 2002.

Other Top Choices

Clusty, from Vivisimo, presents both standard web search results and Vivisimo's dynamic clusters that automatically categorize results. Clusty allows you to use Vivisimo's dynamic clustering technology on ten different types of web content including material from the web, image, weblog and shopping databases. You can access each type of search by simply clicking a tab directly above the search box. (Review: Reducing Information Overkill, SearchDay, Sept. 30, 2004).
Meta search engine for the US and several European countries, as well as in various subject areas. Has ability to save your results for easy rerunning at a future point.
Formerly a crawled-based search engine, Excite was acquired by InfoSpace in 2002 and uses the same underlying technology as the other InfoSpace meta search engines, but maintains its own portal features.
Fazzle offers a highly flexible and customizable interface to a wide variety of information sources, ranging from general web results to specialized search resources in a number of subject specific categories. Formerly called SearchOnline.
Gimenei queries an undisclosed number of search engines and removes duplicates from results. Its most useful feature is an advanced search form that allows you to limit your search to a specific country.
Meta search engine with thumbnail displays. The Quick View display, similar to what WiseNut has long offered, is cool. The service queries WiseNut, Yahoo, Teoma and then somewhat repetitively also includes Yahoo-powered MSN, AltaVista and AllTheWeb. Sadly, the search sources are not disclosed within the actual search results, which makes it hard to know exactly where the results are coming from.
Provides results from 14 search engines and pay-per-click directories, including Google, Ask Jeeves, Yahoo, Kanoodle, LookSmart, About, Overture and Open Directory. Also offers shopping, news, eBay, audio and video search, as well as a number of other interesting features. (Review: New Metasearch Engine: Search Engine Watch Blog, Oct. 18, 2004)
In a compact format, InfoGrid provides direct links to major search sites and topical web sites in different categories. Meta search and news searching is also offered.
Infonetware RealTerm Search
This site is primarily designed to demonstrate classification technology from Infogistics. It's a meta search engine, and it does topical classification of results, like Vivisimo. However, it is unique in that you can select several different topics, then "drill down" to see results from all of them, rather than being restricted to the results from only one topic.
Meta search engine that ranks results based on the number of "top 10" rankings a site receives from the various search engines.
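That "top 10" vote-counting approach amounts to a simple tally: each engine whose top-10 list contains a site casts one vote for it. An illustrative sketch with invented site names, not the engine's actual code:

```python
from collections import Counter

def rank_by_top10_counts(engine_top10s):
    """Score each site by how many engines' top-10 lists
    include it, then sort by vote count (descending)."""
    counts = Counter()
    for top10 in engine_top10s:
        counts.update(set(top10))  # one vote per engine, even if listed twice
    return [site for site, _ in counts.most_common()]

lists = [["x.com", "y.com"], ["y.com", "z.com"], ["y.com", "x.com"]]
print(rank_by_top10_counts(lists))  # ['y.com', 'x.com', 'z.com']
```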
iZito is a meta search engine with a clever feature. Click on any listing you are interested in using the P icon next to the listing title. That "parks" the listing into your to do list. Click on the P tab, and you can see all the pages you've culled. It's an easy, handy way to make a custom result set. Also interesting is the ability to show listings in up to three columns across the screen, letting you see more results at once. (Review: iZito & Ujiko: Meta Search With Personality Search Engine Watch Blog, Sept. 29, 2004)
This search result comparison tool is cool. It allows you to search two major search engines at the same time, then see results that are found on both first, followed by results found on only one of them next. The small overlap visual tool displayed is great. I used to make examples like this to explain search engine overlap and why one search engine may not cover everything. Now I have an easy dynamic way to do this. The stats link at the bottom of the home page provides more visuals. (Update: Jux2 Adds New Features, Search Engine Watch Blog, Oct. 13, 2004)
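Under the hood, this overlap view is plain set arithmetic: intersect the two result lists, then take each side's leftovers. A hedged sketch (the result lists are invented, and this is not Jux2's actual code):

```python
def compare_engines(results_a, results_b):
    """Return (results found on both engines, only on A, only on B),
    preserving each engine's original ordering."""
    set_a, set_b = set(results_a), set(results_b)
    both = [u for u in results_a if u in set_b]
    only_a = [u for u in results_a if u not in set_b]
    only_b = [u for u in results_b if u not in set_a]
    return both, only_a, only_b

a = ["a.com", "b.com", "c.com"]  # invented results
b = ["b.com", "d.com"]
print(compare_engines(a, b))  # (['b.com'], ['a.com', 'c.com'], ['d.com'])
```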
Meta search with the ability to create an "exclusion list" to block pages from particular web sites being included. For example, want to meta search only against .org sites? French version also offered.
One of the oldest meta search services, MetaCrawler began in July 1995 at the University of Washington. MetaCrawler was purchased by InfoSpace, an online content provider, in February 1997.
Search against several major search engines and paid listings services. Offers a nice option to see Alexa info about pages that are listed.
Brings back listings from several major search engines as well as "Invisible Web" resources. Formerly based at the University of Kansas, the site was purchased by search company Intelliseek in April 2000.
Query Server
Search against major web-wide search engines, as well as major news, health, money and government search services.
Turbo10 is a metasearch engine that accesses both traditional web search engines and some invisible web databases, with a very speedy interface. (Review: Make way for the contender to Google's crown, The Register, May 30, 2003)
This meta search engine, operated by CNET, offers both web-wide search and a wide variety of specialty search options. It absorbed SavvySearch in October 1999; SavvySearch was one of the older metasearch services, around since May 1995 and formerly based at Colorado State University.
From the makers of visual meta search tool KartOO, this is a really slick service to try. Do your search, then scroll through the list. See something bad? Click the trash can icon, and the listing goes away. It's a great way to prune your results -- even better would have been if everything trashed brought up something new to look at. That would be a help for those who simply refuse to go past the first page of results.
See something you like? Click the heart icon and you can rate the listing. This information is remembered, to help ensure the sites you choose rank better in future searches. Unlike KartOO, Ujiko uses results from only one search engine: Yahoo. It also offers many more features I haven't yet explored, and Gary Price gives a rundown of them. The only downside? Flash is required.
Formerly a crawled-based search engine owned by Excite, Webcrawler was acquired by InfoSpace in 2002 and uses the same underlying technology as the other InfoSpace meta search engines, but offers a fast and clean, ad-free interface.
Provides a variety of ways to sort the results retrieved, plus provides interesting visualization tools and other features. (Review: ZapMeta: A Promising New Meta Search Engine, Feb. 26, 2004)

Specialty Choices

The metacrawlers listed below let you meta search in specific subject areas.
Family Friendly Search
Meta search service that queries major kid-friendly search engines.
Meta search service for licensed and commercially available digital media downloads including music, movies, music videos, ringtones, mobile games and PC games, searching over 12 million media files. (Review: GoFish Multimedia Shopping Search: IceRocket Deal & Closer Look, Search Engine Watch Blog, Feb. 4, 2005)
Searches 15 U.K. engines. The advanced search form allows you to change the order that results are presented, either by speed or manually to suit your own preferences.
Watson for the Macintosh
Watson is a "Swiss Army Knife" with nineteen interfaces to web content and services -- an improvement on Sherlock, with nearly twice as many tools, including Google Searching.

All-In-One Search Pages

Unlike metacrawlers, all-in-one search pages do not send your query to many search engines at the same time. Instead, they generally list a wide variety of search engines and let you search with the engine of your choice without having to go directly to that search engine's site.
Google Versus Yahoo Tool
See visually how results compare on Google versus Yahoo.
One Page MultiSearch Engines
Clean interface lets you query major services from one page.
Lets you easily send your search to one of several search engines. It also has links to search engine help pages.
Queryster lets you quickly get results from one of several major search engines, simply by clicking an icon. (Review: A Fun Multi-Search Tool, Feb. 23, 2004)
Select your search engines from the many choices offered. The results will all appear within one page, side-by-side. It's a great way to compare results, though a bit hard to read with more than two search engines selected.

Meta Search Articles

For other articles and older reviews, also see the Search Engine Reviews page.
Meta Search Engines are Back
SearchDay, Dec. 4, 2003
It's been a busy year for the major meta search engines, with a number of notable developments that have restored their usefulness as worthy search tools.
Meta Search Engines: An Introduction
SearchDay, September 16, 2002
This week, SearchDay focuses on the world of meta search engines, looking under the hood at how they work and profiling the major players and their offerings
The Big Four Meta Search Engines
SearchDay, September 17, 2002
Though there are dozens of useful meta search engines, InfoSpace is the industry gorilla, operating the four arguably best known and most heavily used properties.
The Best and Most Popular Meta Search Engines
SearchDay, September 18, 2002
Meta search engines look pretty much the same up front, but their approach to presenting results varies widely. Here's a list of Search Engine Watch's pick of the best and most popular metas for searching the web.
A Meta Search Engine Roundup
SearchDay, September 19, 2002
Completing our roundup of meta search engines, this list focuses on services that are competent and in many cases worthy of a look, but don't meet all of our evaluation criteria.
Meta Search Or Meta Ads?
The Search Engine Report, June 4, 2001
A review of meta search services by Search Engine Watch shows that some are providing results where more than half of their listings are paid links. A guide to what's paid, what's not and how to get the most from your meta search service.
Looking for more articles and reviews of meta search engines? See the Meta Search category of the Search Topics section of Search Engine Watch available to Search Engine Watch members.



Welcome to the Internet Search FAQ 

How to Find Information, People, Data, Text, Pictures, Sounds and Almost Anything Else on the Net

FAQ Contents 
  4. HOW CAN I FIND...?
    1. How Can I Find Specific files, texts, multimedia or people? 
    2. How Can I Find Specific information? 
    3. How Can I Find More General Background Information?
    1. How reliable is the Net?
    2. What can I do about it?
    3. How should Internet sources be cited?
  10. URLS FOR A RAINY DAY - Loads of useful links for research of all kinds
This Frequently Asked Questions guide was last modified 1 June 2013

Caught in the Net? Going nowhere on the Information Superhighway? Fear no more.

Help is at hand. The Internet Search FAQ is here to help you find what you want - and retain your sanity in the process. You will also find hundreds of essential links for searching in Urls for a Rainy Day.
The main FAQ page is designed to be used by anyone, no matter how much or little you know about searching the Net. Unlike books and pages written by experts, it is based on the kinds of questions that typical users ask: why should I use the Internet? What's the best way to find specific things, specific information, more general information? How can I speed up my searches? Will I get better results if I pay? How reliable is the information I find?

If you're impatient to get started, go straight to our essential links and start clicking.

We also look at how to find further assistance, and try to guess what changes are on their way (although with the speed the Net changes, they may be happening even as you read).

News and New URLs
The latest on searching, plus our list of newly discovered resources to help you find your way around whatever subject you want to search on the Net. For details of these and other new ways of finding information on the Internet, go to our what's new page, updated regularly.

Internet Searching Tools
The following tools and services are designed for searching the Internet for sites and resources.
Note: These tools are ranked based on their interface, versatility, and ease of use. The How to Search the Internet section provides useful tools for learning about Internet searching. The Best of the Rest section provides an eclectic list of other useful resources for a variety of Internet searching needs.

Quick Links to the Top Twelve Search Tools by Category
Click the header for a short description of each tool and a link to the Help menu.
Google Advanced Search
Advanced Search
AltaVista Advanced Search
DuckDuckGo
Gigablast Advanced Search
Lycos Advanced Search

Ixquick Metasearch

Monster Crawler
Virtual Reference Shelf
Internet Public Library
Digital Librarian
Best of the Web

Open Directory Project
World Wide Web Virtual Library
Complete Planet: The Deep Web
A1WebDirectory.org
JoeAnt

Finding Information on the Internet: A Tutorial
BARE BONES 101: A Web Search Tutorial
Web Search Tutorial by Pandia
Web Search from
Best Search Tools Chart
Graduated Search Strategy
How to Choose a Search Engine
Tips for Effective Internet Searching
Evaluating Web Resources
Choose the Best Search Tool
Search Engine Showdown
Search Engine Watch
Google Books
Google Scholar
Google Uncle Sam
How Stuff Works
Essential Links (EL)
Search Engine Colossus
Yahoo BabelFish Language Translator

U.C. Berkeley Library Web

Finding Information on the Internet: A Tutorial
Invisible or Deep Web: What it is, How to find it, and Its inherent ambiguity
UC Berkeley - Teaching Library Internet Workshops

What is the "Invisible Web", a.k.a. the "Deep Web"?

The "visible web" is what you can find using general web search engines. It's also what you see in almost all subject directories. The "invisible web" is what you cannot find using these types of tools.
The first version of this web page was written in 2000, when this topic was new and baffling to many web searchers. Since then, search engines' crawlers and indexing programs have overcome many of the technical barriers that made it impossible for them to find "invisible" web pages.
These types of pages used to be invisible but can now be found in most search engine results:

  • Pages in non-HTML formats (pdf, Word, Excel, PowerPoint), now converted into HTML.
  • Script-based pages, whose URLs contain a ? or other script coding.
  • Pages generated dynamically by other types of database software (e.g., Active Server Pages, Cold Fusion). These can be indexed if there is a stable URL somewhere that search engine crawlers can find.
Why isn't everything visible?
There are still some hurdles search engine crawlers cannot leap. Here are some examples of material that remains hidden from general search engines:
  • The Contents of Searchable Databases. When you search in a library catalog, article database, statistical database, etc., the results are generated "on the fly" in answer to your search. Because the crawler programs cannot type or think, they cannot enter passwords on a login screen or keywords in a search box. Thus, these databases must be searched separately.

    • A special case: Google Scholar is part of the public or visible web. It contains citations to journal articles and other publications, with links to publishers or other sources where one can try to access the full text of the items. This is convenient, but results in Google Scholar are only a small fraction of all the scholarly publications that exist online. Much more - including most of the full text - is available through article databases that are part of the invisible web. The UC Berkeley Library subscribes to over 200 of these, accessible to our students, faculty, staff, and on-campus visitors through our Find Articles page.
  • Excluded Pages. Search engine companies exclude some types of pages by policy, to avoid cluttering their databases with unwanted content.

    • Dynamically generated pages of little value beyond single use. Think of the billions of possible web pages generated by searches for books in library catalogs, public-record databases, etc. Each of these is created in response to a specific need. Search engines do not want all these pages in their web databases, since they generally are not of broad interest.
    • Pages deliberately excluded by their owners. A web page creator who does not want his/her page showing up in search engines can insert special "meta tags" (for example, <meta name="robots" content="noindex">) that will not display on the screen, but will cause most search engines' crawlers to avoid the page.

How to Find the Invisible Web

Simply think "databases" and keep your eyes open. You can find searchable databases containing invisible web pages in the course of routine searching in most general web directories; these are of particular value in academic research.
Use Google and other search engines to locate searchable databases by searching a subject term and the word "database". If the database uses the word database in its own pages, you are likely to find it in Google. The word "database" is also useful in searching a topic in the Yahoo! directory, because they sometimes use the term to describe searchable databases in their listings.
  • plane crash database
  • languages database
  • toxic chemicals database

Remember that the Invisible Web exists. In addition to what you find in search engine results (including Google Scholar) and most web directories, there are other gold mines you have to search directly. These include all of the licensed article databases, magazines, reference works, news archives, and other research resources that libraries and some industries buy for those authorized to use them.
As part of your web search strategy, spend a little time looking for databases in your field or topic of study or research. The contents of these may not be freely available: libraries and corporations buy the rights for their authorized users to view the contents. If they appear free, it's because you are somehow authorized to search and read the contents (library card holder, company employee, etc.).

The Ambiguity Inherent in the Invisible Web: It is very difficult to predict what sites or kinds of sites or portions of sites will or won't be part of the Invisible Web. There are several factors involved:
    • Which sites replicate some of their content in static pages (hybrid of visible and invisible in some combination)?
    • Which replicate it all (visible in search engines if you construct a search matching terms in the page)?
    • Which databases replicate none of their dynamically generated pages in links and must be searched directly (totally invisible)?
    • Search engines can change their policies on what they exclude and include.

Want to learn more about the Invisible Web?

Invisible Web: What it is, Why it exists, How to find it, and Its inherent ambiguity
Copyright © 2012 The Regents of the University of California, licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
Last update 05/08/12.


Architecture of the World Wide Web, Volume One

W3C Recommendation 15 December 2004

Ian Jacobs, W3C
Norman Walsh, Sun Microsystems, Inc.
See acknowledgments (§8).
Please refer to the errata for this document, which may include some normative corrections.
See also translations.


The World Wide Web uses relatively simple technologies with sufficient scalability, efficiency and utility that they have resulted in a remarkable information space of interrelated resources, growing across languages, cultures, and media. In an effort to preserve these properties of the information space as the technologies evolve, this architecture document discusses the core design components of the Web. They are identification of resources, representation of resource state, and the protocols that support the interaction between agents and resources in the space. We relate core design components, constraints, and good practices to the principles and properties they support.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at
This is the 15 December 2004 Recommendation of “Architecture of the World Wide Web, Volume One.” This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was developed by W3C's Technical Architecture Group (TAG), which, by charter, maintains a list of architectural issues. The scope of this document is a useful subset of those issues; it is not intended to address all of them. The TAG intends to address the remaining (and future) issues now that Volume One is published as a W3C Recommendation. A complete history of changes to this document is available. Please send comments on this document to (public archive of public-webarch-comments). TAG technical discussion takes place on (public archive of www-tag).
This document was produced under the W3C IPR policy of the July 2001 Process Document. The TAG maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

List of Principles, Constraints, and Good Practice Notes

The following principles, constraints, and good practice notes are discussed in this document and listed here for convenience. There is also a free-standing summary.
Data Formats
General Architecture Principles

1. Introduction

The World Wide Web (WWW, or simply Web) is an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URI).
Examples such as the following travel scenario are used throughout this document to illustrate typical behavior of Web agents—people or software acting on this information space. A user agent acts on behalf of a user. Software agents include servers, proxies, spiders, browsers, and multimedia players.
While planning a trip to Mexico, Nadia reads “Oaxaca weather information: ''” in a glossy travel magazine. Nadia has enough experience with the Web to recognize that "" is a URI and that she is likely to be able to retrieve associated information with her Web browser. When Nadia enters the URI into her browser:
  1. The browser recognizes that what Nadia typed is a URI.
  2. The browser performs an information retrieval action in accordance with its configured behavior for resources identified via the "http" URI scheme.
  3. The authority responsible for "" provides information in a response to the retrieval request.
  4. The browser interprets the response, identified as XHTML by the server, and performs additional retrieval actions for inline graphics and other content as necessary.
  5. The browser displays the retrieved information, which includes hypertext links to other information. Nadia can follow these hypertext links to retrieve additional information.
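The first two steps above, recognizing a URI and choosing a retrieval action for its scheme, can be sketched in Python using only the standard library. This is an illustrative toy, not the Recommendation's normative behavior, and the hostname weather.example.com stands in for the URI elided in the scenario:

```python
from urllib.parse import urlparse

# A minimal sketch of steps 1-2: recognize a URI and dispatch on its scheme.
# Real browsers do far more (redirects, caching, content negotiation).

KNOWN_SCHEMES = {"http", "https", "ftp", "mailto"}

def recognize_uri(text: str) -> bool:
    """Step 1: does the typed text look like an absolute URI?"""
    parts = urlparse(text)
    return parts.scheme in KNOWN_SCHEMES

def dispatch(uri: str) -> str:
    """Step 2: pick a retrieval action based on the URI scheme."""
    scheme = urlparse(uri).scheme
    if scheme in ("http", "https"):
        return "perform an HTTP GET for " + uri
    if scheme == "mailto":
        return "open a mail composer for " + uri
    return "no handler configured for scheme: " + scheme

print(recognize_uri("http://weather.example.com/oaxaca"))  # True
print(dispatch("http://weather.example.com/oaxaca"))
```

Entering plain words instead of a URI fails step 1, which is why browsers fall back to a search query in that case.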
Darkbird18 does online Internet research, and this is one of the best websites to start with to understand what the Internet, the WWW, and all their resources are about. The W3C has been online for a very long time and has very "Good Internet Information," with detailed resources to help anyone who wants to understand what the Internet is all about. What I have been running into is Internet users who have no idea what is going on on the Internet, and when something goes wrong and they start having online problems, they just get mad and give up! The Internet is part of the New Age of Information and has changed the way we do everything, so it's a good idea to have some basic idea of how it works. The W3C will aid you in this, so read on, because they have very detailed information on this subject.

Architecture of the World Wide Web, Volume One


Developing an Internet Business Plan:

Darkbird18 found this Internet Business Plan by mistake and was glad for it, because a complete Internet business plan is really hard to find in one piece! I started looking for one back in 1995 and had a very hard time finding it, because no one was trying to develop an Internet business plan; they were all too busy trying to get rich! But I kept looking, because the foundation is how things get done. A Pitt University student, Michael Yellin, MBA/MS-MoIS, wrote this plan in 1992, and thank God he put it up online, because without it there would be little to help new online businesses get going. Read this plan and you will be ready to set up your online business and make the Internet yours.

Developing an Internet Business Plan*

If you are interested in developing a new business on the Internet or expanding your current business onto the global information superhighway, it is important to develop a business plan as part of your preparations. Like a regular business plan, your Internet business plan must give details of the proposed venture, along with expected needs and results (Kuratko and Hodgetts, 1992).
In addition, it must take into account the unique nature of electronic commerce.

Purpose of a Business Plan

A business plan is a proposal for a new venture. It is designed to convince the reader to support the proposed project. If the presenter of the plan is an entrepreneur, the plan's purpose is to raise capital for the project from investors. If the plan is being presented by an employee within a company, then the plan's purpose is to convince internal management to undertake the new project. This planning also has another purpose: to force the entrepreneur to do thorough and effective analysis.

Internet Business Issues

Electronic commerce on the Internet is relatively new and poses many unique challenges. First, most people do not know exactly what the Internet is or what it can offer businesses. This is a hurdle that must be overcome in your business plan. Second, resources that are taken for granted in the real world often do not exist or are in formative stages in the on-line world. For example, payment systems, ad page pricing, and market demographic tracking are all in various stages of development on-line. Third, the pace on the Internet is dizzying. Keeping track of the rapidly changing trends, technology, and competitors is crucial to the success of your business.

How Much Work is it?

Just like in any endeavor, you would not make substantial investments without careful research and an understanding of what you are doing. One example, a real "dot com business plan" for "Intellichild," has been posted online. To my knowledge, this is the only site offering a variety of real business plans free, online. From it you can judge the effort required to put together a plan that builds a significant business case.

The Ten Sections of an Internet Business Plan

(all but #5, #6, and #10 are required for our course)
  1. Executive Summary (required): This section must concisely communicate the basics of your entire business plan. Keep in mind that your reader may be unfamiliar with the Internet and its tremendous potential.
  2. Business Description (required): In this section discuss your firm's product or service along with information about the industry. Because your business plan revolves around the Internet, spend some time explaining it first. Then describe how your product and the Internet fit together or complement each other. As with any business plan, consider your audience. If the readers are technically unsophisticated, make sure you include definitions along with any technological terminology.
  3. Marketing Plan (required): With the business described, next you must discuss your target market, identify competitors, describe product advertising, explain product pricing, and discuss delivery and payment mechanisms.
    • Customers: You must define who your customers are and how many of them exist on the Internet. There are demographic studies by organizations such as The Internet Society and The Internet Group that can help you determine this.
    • Competitors: Use Internet search engines to look for known competitors or similar products to yours. Be sure to use several search engines, because each uses different search techniques. After you have identified your competitors, perform a new search every few weeks or months. Companies are continuously joining the Internet. Remember, readers of your business plan will be very interested in how you are going to beat the competition.
    • Advertising: Describe how you are going to tell the Internet community about your product or service. Designing beautiful Web pages is only a first step. You must also get the word out about your Web site. Some tips: add your Web address to the databases of search engines such as Lycos and WebCrawler, submit it to What's New at NCSA Mosaic, and add it to the bottom of all of your e-mail messages.
    • Pricing: How are you setting prices for your products or services? If your product is intangible information delivered over the Internet, you should try to create some sort of pricing model to justify your prices. You could start by researching what others are charging for similar products.
    • Delivery & Payment: How are you going to deliver your product and get paid? E-mail alone is not secure. Consider encryption techniques like PGP, and on-line payment services such as DigiCash.
  4. Research & Development (required): This is where to get into the technical aspects of your project. Address where the project is now, the R&D efforts that will be required to bring it to completion, and a forecast of how much the project will cost. Since the Internet is continually developing, you should also address continuing plans for R&D.
  5. Operations & Manufacturing (not required): In this section, discuss the major aspects of the business, including daily operations and physical location. Also, what equipment will your business require? Will you be using your own Web server, or will you be contracting with another company? Who will be your employees -- will you hire Internet knowledgeable staff, or train them in-house? Be sure to include cost information.
  6. Management (not required): This segment must address who will be running the business and their expertise. Because the business centers around the Internet, be sure to discuss the management team's level of Internet expertise and where they gained it. Also, describe your role in the business.
  7. Risks (required): In this section, you must define the major risks facing the proposed business. In addition to regular business risks such as downward industry trends, cost overruns, and unexpected entry of competitors, also include risks specific to the Internet. For example, be sure to address the issues of computer viruses, hacker intrusions, and unfavorable new policies or legislation.
  8. Financial (required): Potential investors will pay close attention to this area, since it is a forecast of profitability. As in a regular business plan, include all pertinent financial statements. Remember to highlight the low expenses associated with operating on the Internet compared to those of other businesses.
  9. Timeline (required): In this section, you must lay out the steps it will take to make your proposal a reality. When developing this schedule, it might be helpful to talk to other Internet businesses to get an idea of how long their Internet presences took to establish.
  10. Bibliography and Appendices (not required): In addition to business references, include some Internet references in case your readers would like to learn more about the Internet as a part of studying your proposal.


You should now have a better idea of what is involved in developing a winning Internet business plan. Remember, the most important points are: addressing the uniqueness of the Internet, explaining its business advantages and potential, and keeping your audience in mind. For further information, the following two sources may be helpful.


Kuratko, Donald F., and Hodgetts, Richard M. Entrepreneurship: A Contemporary Approach, Dryden Press, 1992.
Resnick, Rosalind, and Taylor, Dave. The Internet Business Guide: Riding the Information Superhighway to Profit, SAMS Publishing, 1994.

* Adapted from a document by Michael Yellin, MBA/MS-MoIS Student
Developing an Internet Business Plan




WWW Overview



WWW stands for World Wide Web. A technical definition of the World Wide Web is: all the resources and users on the Internet that are using the Hypertext Transfer Protocol (HTTP).
A broader definition comes from the organization that Web inventor Tim Berners-Lee helped found, the World Wide Web Consortium (W3C).
The World Wide Web is the universe of network-accessible information, an embodiment of human knowledge.
In simple terms, The World Wide Web is a way of exchanging information between computers on the Internet, tying them together into a vast collection of interactive multimedia resources.
The Internet and the Web are not the same thing: the Web uses the Internet to pass information along.


The World Wide Web was created by Tim Berners-Lee in 1989 at CERN in Geneva. It came into existence as a proposal by him to allow researchers at CERN to work together effectively and efficiently. Eventually it became the World Wide Web.
The following diagram briefly defines evolution of World Wide Web:

WWW Architecture

WWW architecture is divided into several layers as shown in the following diagram:

Identifiers and Character Set

A Uniform Resource Identifier (URI) is used to uniquely identify resources on the Web, and Unicode makes it possible to build web pages that can be read and written in any human language.
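These two pieces meet when non-ASCII text appears inside a URI: URIs are limited to a small ASCII repertoire, so Unicode characters are percent-encoded as UTF-8 bytes. A short Python sketch shows the round trip (the sample path is just an illustration):

```python
from urllib.parse import quote, unquote

# Unicode lets pages carry any human language; URIs are restricted to
# ASCII, so non-ASCII text is percent-encoded as UTF-8 bytes when it
# appears in a URI path or query.
path = "México/Oaxaca"
encoded = quote(path)
print(encoded)            # M%C3%A9xico/Oaxaca
print(unquote(encoded))   # México/Oaxaca
```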


XML (Extensible Markup Language) helps define a common syntax for the Semantic Web.

Data Interchange

The Resource Description Framework (RDF) defines a core representation of data for the Web. RDF represents data about a resource in graph form.


RDF Schema (RDFS) allows a more standardized description of taxonomies and other ontological constructs.


Web Ontology Language (OWL) offers more constructs on top of RDFS. It comes in the following three versions:
  • OWL Lite for taxonomies and simple constraints.
  • OWL DL for full description logic support.
  • OWL Full for more syntactic freedom of RDF.


RIF and SWRL offer rules beyond the constructs available from RDFS and OWL. SPARQL (SPARQL Protocol and RDF Query Language) is an SQL-like language used for querying RDF data and OWL ontologies.
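The core idea behind querying RDF can be shown without any RDF library: store subject-predicate-object triples and match a pattern against them, which is roughly what a SPARQL engine does at its heart. This is a toy sketch with made-up facts, not a real triple store:

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple,
# mirroring how RDF represents data about resources in graph form.
triples = [
    ("TimBL", "invented", "WWW"),
    ("WWW", "uses", "HTTP"),
    ("WWW", "uses", "URI"),
]

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Roughly: SELECT ?o WHERE { WWW uses ?o }
print([o for _, _, o in match(s="WWW", p="uses")])  # ['HTTP', 'URI']
```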


All semantics and rules executed at the layers below Proof, together with their results, are used to prove deductions.


Cryptographic means, such as digital signatures, are used to verify the origin of sources.
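The verification idea can be illustrated with a keyed hash (HMAC) from Python's standard library. This is a simplification: real Web signatures use public-key cryptography rather than a shared secret, and the key and message here are invented for the example:

```python
import hashlib
import hmac

secret = b"shared-secret-key"   # illustrative; real signatures use key pairs
message = b"Architecture of the World Wide Web"

# The signer computes a tag over the message...
tag = hmac.new(secret, message, hashlib.sha256).hexdigest()

# ...and the verifier recomputes it to confirm origin and integrity.
def verify(msg: bytes, received_tag: str) -> bool:
    expected = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_tag)

print(verify(message, tag))              # True
print(verify(b"tampered message", tag))  # False
```

Any change to the message produces a different tag, so tampering is detected at verification time.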

User Interface and Applications

On top of the stack, the User Interface and Applications layer is built for user interaction.

WWW Operation

The WWW works on a client-server model. The following steps explain how the Web works:
  1. The user enters the URL of the web page in the address bar of the web browser.
  2. The browser then asks the Domain Name Server for the IP address corresponding to the host name in the URL.
  3. After receiving the IP address, the browser sends a request for the web page to the web server using the HTTP protocol, which specifies the way the browser and the web server communicate.
  4. The web server receives the request via HTTP and searches for the requested web page. If it is found, the server returns it to the web browser and closes the HTTP connection.
  5. The web browser receives the web page, interprets it, and displays the contents in the browser's window.
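Steps 3-5 above can be sketched by composing the raw HTTP request a browser would send and parsing a sample response, with no network involved. The hostname and the sample response body are placeholders invented for the example:

```python
# Step 3: the browser composes an HTTP/1.1 GET request for the page.
def build_request(host: str, path: str = "/") -> str:
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n")

# Steps 4-5: the server's reply carries a status line, headers, and the
# page body; the browser splits these apart before rendering the body.
def parse_response(raw: str):
    head, _, body = raw.partition("\r\n\r\n")
    status_line = head.split("\r\n")[0]
    status_code = int(status_line.split()[1])
    return status_code, body

request = build_request("www.example.com")
print(request.splitlines()[0])  # GET / HTTP/1.1

sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hello</html>"
print(parse_response(sample))   # (200, '<html>hello</html>')
```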


There has been rapid development in the field of the Web. It has had an impact in almost every area, such as education, research, technology, commerce, and marketing, so the future of the Web is almost unpredictable.
Apart from the huge development in the field of the WWW, there are also some technical issues that the W3 Consortium has to cope with.

User Interface

Work on higher quality presentation of 3-D information is under development. The W3 Consortium is also looking to enhance the Web to fulfill the requirements of global communities, including all regional languages and writing systems.


Work on privacy and security is under way. This will include hiding information, accounting, access control, integrity, and risk management.


There has been huge growth in the field of the Web, which may overload the Internet and degrade its performance. Hence, better protocols need to be developed.
