Semantics is the study of meaning in language. Thousands of words carry multiple connotations, denotations, and usages, and the intended meaning depends on context and on where the word sits in the sentence. For the keyword-based style of web indexing that is popular today, this makes categorizing a web site difficult. A search for “horror” will pull up websites that contain the word in their text, but the user may want something more specific, such as the literary or film genre, or quotes containing the word “horror”.
Google, for example, has adapted its search methods to accommodate this situation and to return more relevant results: it may include synonyms of the search terms or suggest other searches that may be relevant. Still, the search engine relies heavily on the user's knowledge and skill at searching effectively. It is certainly an effective method, but there are better ways in which it could be done.
The concept of a “Semantic Web”, where web pages are readable by both humans and computers, has been floating around the tech community for a while now. The idea is that information on the web can be re-categorized and restructured so that machines can read it more easily, instead of rehashing concepts to work better with the current system.
The Semantic Web in a Nutshell
The idea of a Semantic Web is not new, but active work towards it is still in its infancy. Several models have been proposed to refine the Internet at its base level, including the Resource Description Framework (RDF) and the Web Ontology Language (OWL). These models seek to integrate with current Internet technology and to provide a foundation on which new technologies can be built for new content.
RDF is a data-interchange format that could add to the meta-data currently supported by HTML. That meta-data is severely limited because it was designed to work with the current system, but if languages were developed to expand on the data, organize it, and make it easy for machines to reference, meta-data could rise from a marginalized formality to a powerful web-design tool.
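To make the RDF idea a little more concrete: RDF describes resources as subject-predicate-object statements, often called triples. The sketch below is only a rough illustration in plain Python, not an actual RDF library or serialization. The example.org URIs, the sample data, and the match helper are all hypothetical; the dc: prefixes stand in for Dublin Core terms.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# A triple is (subject, predicate, object), the basic unit of the RDF data model.
Triple = Tuple[str, str, str]

# Hypothetical sample data: the example.org URIs are placeholders, and the
# "dc:" predicates abbreviate Dublin Core terms such as title and subject.
TRIPLES: List[Triple] = [
    ("http://example.org/page/dracula", "dc:title", "Dracula"),
    ("http://example.org/page/dracula", "dc:subject", "horror fiction"),
    ("http://example.org/page/dracula", "dc:creator", "Bram Stoker"),
    ("http://example.org/page/quotes", "dc:title", "Famous Quotes"),
    ("http://example.org/page/quotes", "dc:subject", "quotations"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given fields (None = wildcard)."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# Find pages whose stated subject is horror fiction, rather than pages that
# merely contain the word "horror" somewhere in their text.
horror_pages = [s for s, _, _ in match(TRIPLES, predicate="dc:subject", obj="horror fiction")]
print(horror_pages)
```

The point of the sketch is that a machine can answer "which pages are about horror fiction?" from structured statements, without guessing from the page text.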
OWL is a family of languages that focuses on the ontology of websites: a formal description of the concepts within a domain of discourse and the relationships between them. The goal is to build a network between search terms and the ideas behind those terms, so the “idea” of a web page can be logically organized in multiple ways that a computer can manipulate. All of that translates into better searches for users.
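As a rough illustration of what "relationships between concepts" can buy a search engine, the sketch below models a tiny concept hierarchy and infers broader categories from it. This is plain Python rather than OWL itself; the concept names and the ancestors and is_a helpers are hypothetical. In an actual ontology these links would be expressed as subclass statements (for example with rdfs:subClassOf).

```python
from typing import Dict, Set

# Hypothetical "is a kind of" links, mapping each concept to its direct parents.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "GothicNovel": {"HorrorFiction"},
    "SlasherFilm": {"HorrorFilm"},
    "HorrorFiction": {"HorrorGenre", "Literature"},
    "HorrorFilm": {"HorrorGenre", "Film"},
}


def ancestors(concept: str, hierarchy: Dict[str, Set[str]]) -> Set[str]:
    """Return every broader concept reachable by following parent links."""
    found: Set[str] = set()
    stack = list(hierarchy.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(hierarchy.get(parent, ()))
    return found


def is_a(concept: str, category: str, hierarchy: Dict[str, Set[str]]) -> bool:
    """True if the concept equals the category or falls under it."""
    return concept == category or category in ancestors(concept, hierarchy)


# A page tagged only as "GothicNovel" can still be found under the broader
# idea of horror, because the relationship is inferred rather than spelled out.
print(is_a("GothicNovel", "HorrorGenre", SUBCLASS_OF))
print(is_a("GothicNovel", "Film", SUBCLASS_OF))
```

The design choice here is to store only direct parent links and derive the rest, which is roughly how an ontology lets one page be organized under several ideas at once.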
Semantic Searching and Its Future
Semantic searching is the user-facing level of the Semantic Web technologies. General queries made by the user would return results that make more sense to that user. In the “horror” example, the search engine would take cultural concepts into account and, instead of listing websites that simply contain the word, would return results on the broader concept of “horror”.
There are some issues with that example, of course, since a user may be searching for specific information on horror. The idea, though, is that polysemous words (words with multiple meanings) would be analyzed to estimate the probability of each meaning, based on the other search terms used as well as the content and meta-data of the web page itself. Semantic searching is only now gaining ground, so we won't see it become a reality for a few years yet, but just as open source software is slowly blossoming into an acceptable solution for normal users, semantic searching is likely to become an industry standard for search engines over time.
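One very simplified way to picture that probability estimate is to score each possible meaning by how many of the other search terms point to it. The sketch below does only that; it ignores page content and meta-data, and the sense names, cue words, and sense_probabilities function are hypothetical. It is not how any particular search engine works.

```python
from typing import Dict, List, Set

# Hypothetical senses of the query word "horror", each with a few cue words
# that tend to appear alongside that meaning.
SENSES: Dict[str, Set[str]] = {
    "film genre": {"movie", "film", "director", "slasher", "cinema"},
    "literary genre": {"novel", "book", "author", "gothic", "story"},
    "quotation": {"quote", "quotes", "said", "line"},
}


def sense_probabilities(query_terms: List[str], senses: Dict[str, Set[str]]) -> Dict[str, float]:
    """Estimate how likely each sense is, given the other terms in the query."""
    terms = {term.lower() for term in query_terms}
    # Count cue words shared with the query; the +1 keeps every sense possible
    # even when no cue word matches.
    scores = {sense: 1 + len(terms & cues) for sense, cues in senses.items()}
    total = sum(scores.values())
    return {sense: score / total for sense, score in scores.items()}


probs = sense_probabilities(["horror", "novel", "author"], SENSES)
best = max(probs, key=probs.get)
print(best, probs)
```

With the query "horror novel author", two cue words match the literary sense and none match the others, so the literary genre comes out as the most probable reading.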
As it stands, search engines rely on a symbiotic relationship between the publisher and the searcher, each arranging words in the proper way (for publishers, this is SEO), to make results the best they can be. Popularity is also taken into account: websites that are cited more frequently are ranked higher. As is well known, the current system allows anyone to disseminate information regardless of whether it is correct or simply popular opinion. Semantic searching could set aside popularity-driven results and give you multiple opinions on the topic regardless of their perceived popularity. The human element will always be necessary to decide what information is factual, but at least the sources will not be determined entirely by how well the webmaster can game the system.