Search Engine Marketing

W3C Publishes new Rule Interchange Format for Business Databases

The W3C working group on the Rule Interchange Format has this week published a new semantic standard to increase interoperability between databases regardless of their underlying technologies. Rule-based technologies have recently attracted a great deal of activity, notably in business rules processing and in rule-based reasoning in the context of the Semantic Web.

The W3C Rule Interchange Format (RIF) Working Group has published several public specifications of the new W3C RIF standard. The Object Management Group has published its specification for Production Rules Representation, which is being aligned with W3C’s RIF. OMG™ is an international, open membership, not-for-profit computer industry consortium. OMG Task Forces develop enterprise integration standards for a wide range of technologies, and an even wider range of industries. People around the world are already working on implementing these specifications for business purposes. Business rules are also beginning to play an important part in Web Services in general, as well as in Business Process Management.

The Corporate Semantic Web research group at the Freie Universitaet Berlin has been deeply involved in standardisation. Prof. Dr. Adrian Paschke, who leads the CSW group, has co-edited several of the W3C RIF specifications. Together, the RIF specifications allow systems using a variety of rule languages and rule-based technologies to interoperate with each other and with other Semantic Web technologies. The drafts define XML formats with formal semantics for storing and transmitting rules between systems.

This represents a significant step forward in achieving Tim Berners-Lee’s vision of an “intelligent”, interoperable web of data that frees data components from their locations. Although much of the focus on implementation of the new standard will be on business processes, the potential for increasing transparency amongst government agencies can’t be overlooked.

The new RIF standard will be featured at the RuleML 2009 symposium in a W3C RIF workshop in November 2009 (see http://2009.ruleml.org), co-located with the Business Rules Forum in Las Vegas, Nevada, USA.

Semantic Discussion Heats Up; Critics Call Up Wittgenstein

There’s been a great deal of discussion regarding the widespread adoption of Semantic Web standards now that Google, Yahoo and Bing have admitted adjusting their algorithms to RDF and microformats. What’s equally interesting is that, as the standards spread like wildfire, a growing number of voices are raised in alarm at the sinister implications of smart algorithms being able to interpret contextual metadata in the same manner as a human reader and deliver more relevant results for search engine queries. Justin Leavesley goes as far as to invoke Wittgenstein’s Tractatus Logico-Philosophicus in his efforts to dispute the notion of contextual mark up being used to identify concepts on the web with Uniform Resource Identifiers (as opposed to Uniform Resource Locators).

Although Wittgenstein himself refuted many of the Tractatus’s assumptions in his later work, the core of the argument shared by many at the It’s Just Semantics FriendFeed forum appears to be the assertion that Semantic mark up is dangerous because it establishes set rules for the meaning of data components as they are processed by algorithms. The notion that standards would be adopted as static rules seems to have sent chills up and down the spines of some commentators. And a case could be made for some form of info tyranny if indeed the W3C were in the process of prescribing a definitive bible of meanings for web-based data.

However, the truth is far from it. There is no uniformity to RDF-ascribed ontologies. Indeed, RDF uses URIs as symbols to denote concepts. Anyone can create a URI to denote anything they like, and they get the right to say what it means. If you don’t agree with the meaning, you don’t use it, and substitute a different URI. If a third person wants to use both descriptions, they can use RDF to relate the two URIs if they so desire. They can do this privately or publicly. What could be more democratic?
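The idea of a third party relating two independently coined URIs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two vocabulary URIs are hypothetical, statements are held as plain (subject, predicate, object) tuples rather than in a real RDF library, and owl:sameAs is just one of several linking properties a publisher might choose.

```python
# A minimal sketch: two people coin their own URIs for the same concept,
# and a third party publishes one statement relating them.

# Real property URI from the OWL vocabulary, used here as the linking predicate.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical URIs, invented for illustration only.
ALICE_FILM = "http://alice.example/terms#Film"
BOB_MOVIE = "http://bob.example/vocab#Movie"

# The third party's statement, as a (subject, predicate, object) tuple.
statements = [
    (ALICE_FILM, OWL_SAME_AS, BOB_MOVIE),
]

def equivalents(uri, store):
    """Return the URIs directly linked to `uri` by owl:sameAs, in either direction."""
    found = set()
    for subject, predicate, obj in store:
        if predicate != OWL_SAME_AS:
            continue
        if subject == uri:
            found.add(obj)
        elif obj == uri:
            found.add(subject)
    return found

alice_matches = equivalents(ALICE_FILM, statements)
bob_matches = equivalents(BOB_MOVIE, statements)
```

Neither original author has to change anything: the link lives in the third party's data, and anyone who disagrees with it can simply ignore that statement.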

As for definitive rules obscuring natural language and fuzzy logic, the Semantic Web is just a way to organize what people know - power of a database underneath, flexibility of folksonomy on top, efficient algorithms in between. Meanwhile, when an inference engine hits a snag, interested parties will get a notice and fix the problem; much of this will be done manually, wiki-style, with an edit or a few clarifying triples.

What’s not to like?

The San Jose SemTech conference  ended last month with a treasure trove of vodcast discussions regarding Semantic adoption and its practical applications.  The SemTech Panel Discussions are well worth checking out.

The BBC now applies Semantic markup to its music website and even offers a tutorial on running SPARQL queries on its music tracks.

Google Addresses Canonical Data With Mark Up Standard

One of the biggest hazards to efficient ranking on Google’s index has been the proliferation of duplicate data on the web. Duplicate content can pose quandaries for crawling spiders and dilute the relevance of primary content published on a website. In a nutshell, if your URL http://mydomain.co.uk is also findable as http://www.mydomain.co.uk or even http://mydomain.co.uk/, then Google’s algorithms may register each URL as a separate location even when the content is identical.

Fortunately, Google likes to play fair on occasion and has specifically developed a Canonical Link Element for webmasters to include in order to refer its spiders to the primary URL. Google engineer Matt Cutts explains the details here.

To specify a canonical link to the page http://www.example.com/product.php?item=swedish-fish, create a <link> element as follows:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>

Copy this link into the <head> section of all non-canonical versions of the page, such as http://www.example.com/product.php?item=swedish-fish&sort=price.
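For sites that generate pages from templates, the same element could be produced programmatically. The following Python sketch is an illustration only, not anything Google supplies: the function names and the list of presentation-only query parameters (here just `sort`) are assumptions, and a real site would need its own list.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query parameters that change presentation, not content.
PRESENTATION_PARAMS = {"sort"}

def canonical_url(url):
    """Drop presentation-only query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link> element to place in the <head> of a non-canonical page."""
    return '<link rel="canonical" href="{}"/>'.format(
        escape(canonical_url(url), quote=True)
    )

tag = canonical_link_tag(
    "http://www.example.com/product.php?item=swedish-fish&sort=price"
)
```

Here `tag` holds the same element shown above, so the sorted view of the product page points back to the unsorted one.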


Google Squared: Google’s New Semantic Search Product

As the highest valued brand in the world, Google cannot afford to lag behind in the innovations that are developing right in the middle of its industry front yard. The expansion of Semantic Web development from academic niche to application development to wide adoption has not gone unnoticed by the alumni at Mountain View, California (headquarters of the global Google enterprise). Google has now rolled out Google Squared, the first specifically tagged semantic search product from the biggest search company in the world. The pre-announcement was made last month on May 12, 2009 and it was launched just two weeks ago in the Google Labs area of the Google website. Google Squared was developed at Google’s New York office. It is the first significant effort by Google to give its algorithms a rudimentary understanding of the information architecture on the web and render the results of a query in new and more useful ways.

[Image: Square Pegs for Google Search Holes]

Google Squared extracts structured data from across the web and presents its results in a spreadsheet-like format. Each search query returns a table of search results, with its own square or set of columns depicting common attributes associated with the topic of the search. Nathania Johnson of Search Engine Watch describes Squared as “quite possibly … one of Google’s significant achievements”.

This response to a query based on associations and contextual inference is of course the foundation stone of the Semantic Web project. By adopting this functionality as a response to its users’ demands, Google has effectively answered those pundits who have questioned the very notion of semantic structured data as relevant to the emerging architecture of the changing web.

Our advice is to try it for yourself and see first hand the direction search is going, thanks to Tim Berners-Lee and the Semantic Web development movement.

NASA Adopts Semantic Formatting for Space Exploration Data

TopQuadrant, a Semantic technology company, announced today that NASA is using their TopBraid Suite to “model, organize, integrate and exchange data” within the NASA Constellation Program. The Program, announced in 2004, is to explore the solar system - starting with a return to the Moon and ultimately aiming to explore Mars and other destinations.

Part of the reason NASA is using TopQuadrant is to reduce operational costs and shorten development cycles. In spite of Google’s claims for Wave at this month’s Google I/O, according to NASA semantic technologies are better at enabling “rapid changes in large, complex and distributed data systems”, which are necessary to process information on modelling the galaxy.

The TopBraid Suite is a platform for developing semantic applications. “Users can rapidly assemble, deploy and manage dynamic ontology-driven applications built with the W3C Semantic Web standards, including RDF/S, OWL and SPARQL”, according to TopQuadrant.

The Constellation Program uses Semantic technology in different ways. One is for collaborative purposes, as a means of managing information for diverse groups across the program, including NASA’s engineers, technicians, operators and managers. Another is to “manage and execute planned and unplanned information changes within a multitude of systems over the duration of the Program, which is expected to continue for 30-40 years.”

NASA is using TopBraid to develop and manage NExIOM, the NASA Exploration Initiatives Ontology Models.

“NExIOM formalizes the way machines and people specify NASA Exploration systems, elements of their Scientific and Engineering disciplines, related work activities, and their interrelationships in the Enterprise. Through the use of model-based knowledge representations and controlled vocabularies information is intelligible and actionable to machines, applications, and people. Information can be reliably found, associated, aggregated, and reasoned over to generate products and inform decisions within and across diverse organizational groups and application domains”, according to TopQuadrant.

NASA has been using  ontologies for aerospace engineering since 2002. Ralph Hodgson, CTO of TopQuadrant,  noted that semantic technology has the same benefits for NASA as it does for all enterprises: improving data quality and extending its longevity, facilitating information flow, and addressing legacy application problems.

As semantic technologies are adopted by more and more data-intensive institutions, the tipping point towards mass adoption of the same W3C standards gets closer and closer.

Microsoft’s Bada Bing!

In its never-ending quest to take on every IT player on earth, Microsoft launched its new (or upgraded) search engine Bing on the world yesterday. Microsoft said that the name Bing “was memorable, short, easy to spell, and that would function well as a URL around the world” and reminded people of the sound made during “the moments of discovery and decision making”. I suppose a light bulb visual was discarded as too cartoon-like. Bing replaces or upgrades Live Search, Windows Live Search and MSN Search, and was codenamed Kumo until its launch in the US yesterday, riding on the back of an $80 to $100 million online, TV, print and radio advertising campaign. It was reported that the advertisements will not take on competing search engines such as Google and Yahoo! directly by name; instead they will attempt to convince users to switch to Bing by focusing on Bing’s unique search features and functionality. So the idea is to compete with Google and Yahoo on the basis of better functionality.

[Image: Bada Bing!]

The first public beta of Windows Live Search was unveiled on March 8, 2006, with the final release on September 11, 2006 replacing MSN Search. The new search engine offered users the ability to search for specific types of information using search tabs that include Web, news, images, music, desktop, local, and the now discontinued Microsoft Encarta. Windows Live Search aimed to make its over 2.5 billion worldwide queries each month “more useful by providing consumers with improved access to information and more precise answers to their questions.”

Bing features a wallpaper gallery interface, so users are greeted by a different background picture each time they access the page, which is cosmetically an improvement over Google’s rather stalwart minimalism. We’re quite happy with Bing because entering the search terms ‘Search Engine Marketing in Oxford’ and ‘SEO in Oxford’ brought up Oxfordseo.com as the number one ranking, which just goes to show that 5 months of ongoing optimisation and updating does pay dividends.

The additional functionality of roll-over snippets of extra information indicates some use of microformats, but it’s too early to tell how much the engine will be riding the Semantic layer of meta format data. Certainly, competing with Google’s recently announced Rich Snippets would seem to mandate a move in the Semantic direction. We’ll keep you posted.

In other Search Engine news there’s some excitement  astir for Mapumental, a new visual search engine that launched today in private beta, which in this iteration helps you work out where you might want to live if you wanted an easy commute to central London.  Register and tell us what you think of it, here.


Semantic in a Nutshell: Useful Definition

Semantic data model

Semantic data models.[9]

A semantic data model in software engineering is a technique to define the meaning of data within the context of its interrelationships with other data. A semantic data model is an abstraction which defines how the stored symbols relate to the real world.[9] A semantic data model is sometimes called a conceptual data model.

The logical data structure of a database management system (DBMS), whether hierarchical, network, or relational, cannot totally satisfy the requirements for a conceptual definition of data because it is limited in scope and biased toward the implementation strategy employed by the DBMS. Therefore, the need to define data from a conceptual view has led to the development of semantic data modeling techniques, that is, techniques to define the meaning of data within the context of its interrelationships with other data. As illustrated in the figure, the real world, in terms of resources, ideas, events, etc., is symbolically defined within physical data stores. A semantic data model is an abstraction which defines how the stored symbols relate to the real world. Thus, the model must be a true representation of the real world.[9]

WolframAlpha: How a Computational Engine Progresses the Semantic Web

Stephen Wolfram’s WolframAlpha was launched on May 15th in a well coordinated wave of news coverage that promoted the computational engine as a useful alternative, if not precisely a rival, to Google. It is a self-described answer engine developed by Wolfram Research: an online service that answers fact-based queries directly by calculating the answer from pre-structured data, rather than just coughing up a list of documents or web pages that correspond to the language of the query, as a search engine might. Users submit queries and computation requests via a text field. WolframAlpha then computes answers and relevant visualizations from a knowledge base of structured data curated by Wolfram’s staff of 100 librarians.

WolframAlpha does differ from semantic search engines, which index a large number of answers and then try to match the question to one, but it bears similarities to them in using interpretative and contextual references. WolframAlpha makes inferences from a smaller set of core information. In this way it has many parallels with Cyc, a project aimed since the 1980s at developing a common-sense inference engine. The database includes hundreds of datasets, such as ‘All Current and Historical Weather’. The datasets have been accumulated over approximately two years, and WolframAlpha’s staff continues to grow. The range of questions that can be answered will grow with the expansion of the datasets, but is currently rather limited. This makes the engine rather less than impressive when being test driven by punters; however, the principle of inference based on structured data sets is certainly demonstrated by its computations, and in that sense it can be viewed as another important step in the direction of Semantic Web development.

[Image: Computational Search]

WolframAlpha underlines the growing importance of the Semantic Web

Social media grows as the  number of people connected, the ways in which they connect, and the things they seek to do once online grow every day, yet the fundamental means of connection between all of these people, all of these places, and all of these things remains the simple  hyperlink with all the sophistication of an extended finger pointing into the mid distance.

And this lack of sophistication is precisely what the Semantic web addresses:

The specifications of the technology are about enabling the description of ’stuff’ - and the connections between one piece of stuff and another - to be declared in ways that are explicit, intelligible and actionable  both by humans and software applications acting on their behalf.  Not just finger pointing.

By  adding  semantic associations of a  name with the person, person with the authored work, and both person and work with the ‘act’ of authorship, the statement gains a greater meaning.   By following the  Linked Data Design Issues and expressing these semantics in a ‘linkable’ fashion, the network of relationships between (in this case) me, my communities of interest and my authored works grows stronger and more useful, across the artificial boundaries imposed by siloed communities  as well as proprietary applications.

The importance of convergence and universal data structures is what underlies the effectiveness of this new data structure. RDFa, microformats and even WolframAlpha’s computational engine are all based on developing richer and more annotated data structures that have the effect of responding more accurately and more comprehensively to our queries for data on the net.

And the best is yet to come.

If you need to understand how the Semantic Web can benefit your organisation’s website, please contact us at http://oxfordseo.com for an audit of your web presence.

Google Continues to Walk Towards the Semantic Light

Google: “Structured data makes the web a better place. It also helps Google better understand and present your page in search results”. And so begins Google’s introduction to its new metadata format branded “Rich Snippets”. That’s Semantic mark up, or the Semantic Web’s passage from childhood into adolescence after its 10 year gestation period at the World Wide Web Consortium and its birth in late 2007 (more or less). Google is taking the adoption of the Resource Description Framework (RDF) and microformats slowly, first marking up data about People and Reviews with metadata of the following structure:

  • The writer of the review
  • The date the review was written
  • The rating (for example, 4/5).
  • For items with multiple user reviews, the number of reviews and average rating.

So if you have a review of a show, a musical gig, an art exhibit or a restaurant on your webpage, your HTML will show the name of the venue or event, the address and phone number, the number of users who have provided reviews, and the average rating. This information shows up as text on your website that can be read and understood by people who know where to find it. They can find it perhaps because they subscribe to your site’s updates, because someone has socially referred them to your site, or because your site is optimised to rank high for that information on the Google index. But to the computer it is nothing but strings of unstructured text. With microformats or RDFa, you can label each piece of text to make it clear that it represents a certain type of data: for example, a venue or event name, the timing of an event, an address, or a rating. This is done by providing additional mark up tags that computers can understand. That means your computer (or more accurately, the search algorithms your computer uses to seek out information on the net) can go straight to the additional information without having to refer to a scanned index of the page that information is on.
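To make the labelling idea concrete, here is a rough Python sketch of how a program might pick labelled values out of a marked-up review. The class names (`hreview`, `item`, `fn`, `reviewer`, `dtreviewed`, `rating`) follow the hReview microformat, but the sample review, the `LabelledTextParser` class and the simplified matching are assumptions for illustration and in no way describe how Google’s own spiders work.

```python
from html.parser import HTMLParser

# Hypothetical review markup using hReview-style class names.
REVIEW_HTML = """
<div class="hreview">
  <span class="item"><span class="fn">The Example Bistro</span></span>
  reviewed by <span class="reviewer">Jane Doe</span>
  on <span class="dtreviewed">2009-05-20</span>,
  rating: <span class="rating">4</span> out of 5.
</div>
"""

# The labels this sketch looks for.
LABELS = {"fn", "reviewer", "dtreviewed", "rating"}

class LabelledTextParser(HTMLParser):
    """Collect the text inside elements whose class is one of LABELS.

    Simplification: assumes every start tag has a matching end tag,
    which holds for the sample above but not for arbitrary HTML.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._open = []  # one entry per open element: its label, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        label = next((c for c in classes if c in LABELS), None)
        self._open.append(label)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Only text sitting directly inside a labelled element is kept.
        if self._open and self._open[-1]:
            self.fields[self._open[-1]] = data.strip()

parser = LabelledTextParser()
parser.feed(REVIEW_HTML)
review = parser.fields
```

A human reader sees the same sentence either way; the difference is that `review` now holds the venue name, reviewer, date and rating as separate, named values rather than one undifferentiated string.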

Google: the Latest Semantic Search Engine

These “Rich Snippets” are mainly going to enhance the site description that Google places under your link when your site comes up as a result of a query. They will allow searchers to get a more accurate assessment of how relevant your link is to their query and make searching yield more relevant results, at least for People and Reviews in the short term.

This marks the re-entry of meta tag descriptions as an essential for site optimisation after a few years in the wilderness of indifference. But more importantly it marks Google’s first significant steps (after their March announcement of adopting the Semantic Web) towards encouraging publishers to adopt structured data standards as advocated by Sir Tim Berners-Lee and the W3C.

It will be a while still before Google starts advocating the mark up of all web data, not just data about People and Reviews, and still longer before there is enough momentum for web publishers to start enhancing their web data with RDFa and microformats. But, in the immortal words of Sam Cooke, Google provides even more evidence that “a change is gonna come”.

Google Becomes Best Evangelist for Semantic Web

After years of quiet indifference to Sir Tim Berners-Lee’s initiative to restructure the data published on the web into a more useful and accessible architecture, Google has taken the extraordinary turn of becoming the industry’s premier evangelist for Semantic mark up. For those who need a brief refresher course, the Semantic Web is an evolving structure of knowledge, built to allow anyone on the Internet to add what they know, find answers to their questions, and cross reference and verify those answers. Information on the Semantic Web, as well as being in natural language, is maintained in a structured meta format which is fairly easy for both computers and people to read and work with.

This is accomplished in various ways: by adding an enhanced mark up based on RDF (the Resource Description Framework); by writing microformats, which are designed to be read by humans first and machines second (microformats are a set of simple, open data formats built upon existing and widely adopted standards); or, as in the case of Yahoo’s SearchMonkey and Google’s new update called “Rich Snippets”, by sensitizing search spiders to scan the Semantic layer of metadata. Google’s Marissa Mayer and Kavi Goel recently demonstrated how Google is working with tech publishers, including CNET, to display new types of cross referenced information in search results. These larger Search players are attempting to create an illusion of proprietary search structures when, in fact, the brilliance of Semantic Web architecture is that it is fundamentally an open architecture whose distinct benefit is that it surpasses proprietary and format barriers. Google said it would not penalise web publishers who did not participate in the Rich Snippets program by making their Web pages less relevant, but Google’s Goel acknowledged that Web pages enhanced with Rich Snippets could see higher click-through rates, which would improve their relevance in Google’s algorithm.

So you don’t really have to conform to semantic mark up, you can continue publishing on the web quite happily, outside of the loop.

The Thames Valley Police in England have a bureaucratic impediment when it comes to cross referencing case information recorded by separate departments on separate databases. The Criminal Case Management number assigned to an incident, for example, exists in a separate data structure from the corresponding Collision Record number, taken down by a different department on a different database. With a Semantic data structure, the separate data entries related to the same incident would be instantly readable and accessible. Semantic represents the ultimate cross platform compliance.

All these developments are based on the Semantic criterion of separating information published on the web from the location (URL) of that information, so that it can be accessed, matched, cross referenced and manipulated by different types of databases and search applications and still be easily read by humans.

The Semantic Web literally liberates data from its unique location so that it can be used by anyone.

The structuring is simple: knowledge is expressed as descriptive statements, saying some relationship exists between one thing and another. “Jane has a mother, Susan” or “Susan is a mother of Jane”. An enormous amount of people’s knowledge can be expressed in sentences like these. “Part #654 has a price, £6.95.” “George has a city of residence, Washington D.C.” “The United States has a new president, Barack Obama.” This kind of information structuring was standardised for the Web in 1999 as RDF, but the basic technique goes back decades if not millennia.
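The example sentences above map directly onto subject, predicate and object. As a minimal sketch (plain Python tuples standing in for RDF, with plain strings where real RDF would use URIs, and a hypothetical `match` helper), they could be stored and queried like this:

```python
# Each statement is a (subject, predicate, object) triple.
# Plain strings are used for readability; real RDF would use URIs.
triples = [
    ("Jane", "has mother", "Susan"),
    ("Part #654", "has price", "£6.95"),
    ("George", "has city of residence", "Washington D.C."),
    ("The United States", "has president", "Barack Obama"),
]

def match(pattern, store):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        triple
        for triple in store
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]

# "What do we know about Jane?"
about_jane = match(("Jane", None, None), triples)

# "What has a price, and what is it?"
priced = match((None, "has price", None), triples)
```

Because every fact has the same three-part shape, one small matching function can answer questions about people, parts and countries alike, which is the point of the uniform structure.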

Add to this recent developments in Natural Language Processing, AI bots that interpret, write and then follow rules from data that they scan, the emergence of linked data patterns and trends from social networks like Twitter, as well as breakthroughs in contextual and interpretive search, and what is developing is a rich mix of human faculties enhanced by machine efficiency.

Even though the basic data format was determined through a consensus orchestrated by the W3C (Berners-Lee’s World Wide Web Consortium), what has taken this long to implement is not the technological or mathematical innovation but the commercial motive: the uptake of marking up data in RDF has progressed at this rate because the private sector has seen the Semantic Web largely as an academic, scholarly initiative with little commercial incentive. Which it was, at the outset; but now that the major Search players are begrudgingly acknowledging the advantage of incorporating changes that actually make searching more productive and easy, by adjusting their means of Search, there is the opportunity for web publishers everywhere, public, commercial, cultural, academic and otherwise, to enhance their web-based information to take advantage of more efficient search vehicles.

Google, Twitter and Facebook’s new business models are sometimes referred to as the Attention Economy* (or disparagingly as the Attention Deficit Economy), and have breathed new life into Semantic initiatives. Rather than focus on developing Semantic Search Engines they can own, businesses are beginning to realise that if Semantic markup added to a website makes the data on the website more findable, people are going to use what most efficiently delivers what they want. The fight to capture people’s attention span is what is common to all business, culture and entertainment in 2009.

Attention Expands

You can’t sell anything to someone if you can’t first attract their attention, and it’s precisely their attention you need to hold and convert if you expect them to actually take action on what you have to offer. With the general public now redefined as the multiple channel public, the competition for people’s attention is even greater. The fact that we have so much choice of where to lend our attention means that we can all be a lot more discerning and picky about what we pay attention to, hence the term Attention Economy.

The Semantic Web is poised to be the keystone of the new economy because, as people are given an ever expanding range of choice, they will choose to focus on those channels that are the most useful and deliver the best return on their attention investment.

Relevance rules.

If you would like to understand your web site’s context within the Semantic Web, please contact us here for a feasibility audit of your site that will detail the steps needed to enhance your site structure with meta format data utilising the Resource Description Framework.

The Semantic Web Will Deliver Them

*To avoid the numbing effect of needless jargon, the Attention Economy can be understood in the context of its predecessor term: the Knowledge Economy. To understand this, just look at the way the highest brand valued company in the world actually operates: Google gives away lots of amazing things: Google Maps, Google local business finder, better WP (Gmail), spreadsheet, presentation and database applications than Microsoft, and it charges its users nothing. Because Google is stupid? No grasshopper, because Google commands amazing numbers of eyeballs on its pages for amazing amounts of time. Because Google commands the Attention Economy. Google is an advertising company that makes money basically by owning the biggest auction house in the world for its AdWords, and because it commands so much of our attention, it has a high market value. That’s why it bought YouTube, another player in the Attention Economy. Except now that Facebook and Twitter are also picking up vast assets (in terms of our attention), Google is, maybe for the first time, facing a challenge to its domain. Rumour has it that at 40,000 new joiners a day, Facebook now carries more traffic than Google. I’d like somebody to tell us how they measure that.

Also, remember that attention is a limited resource. There is a limit to what one individual can pay attention to in one day, week, month or lifetime. Because of the subjective nature of the w3 experience, the number of people doesn’t count as much as the amount of time they dedicate.

And that is what the players of Google, Facebook and Twitter are playing for: what we pay attention to.

Thank you for yours.