Norwegian standards based on NoSQL
Interview with Software Architect consultant Vidar Ingebretsen, Standards Norway
The quest for better search capabilities and XML support led Standard Norway to replace its traditional SQL database with the NoSQL database MarkLogic. Earmark time and resources for skills retraining, advises the Standard Norway project manager. Otherwise you risk getting into trouble.
Standard Norway – the Norwegian equivalent of Danish Standards – is an organisation devoted to developing and publishing standards. Standards help to streamline product development across national borders, so that companies do not have to produce several near-identical versions of the same product. This is not just a good idea for businesses; it also benefits consumers, because standards provide documentation that a product complies with specific requirements – for example in the environmental area. Standard Norway sells its many thousands of standards through its sales company Standard Online, either directly to the Norwegian market or as ‘adapted’ variants, where international standards have been published as Norwegian standards. The standards are published in an online solution where, for several years, the content was stored as PDF files. This meant that users could search only the document metadata when looking for a standard.
"PDF is a very inflexible format in general," says Vidar Ingebretsen, a Software Architect consultant at Standard Norway.
"So in 2013 we began to look around for a new system that could support XML. XML allows advanced searches and supports HTML, so that, for example, a user can read a standard online with hyperlinking – to other standards and also within the same document. It also supports the production of e-books, allowing the creation of an electronic version of a standard. XML makes all this possible," says Vidar Ingebretsen, adding that traditional databases, such as those Standard Norway originally used, did not support XML particularly well, which is why they went in search of a NoSQL database.
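The difference between searching only document metadata and searching the content itself can be illustrated with a small sketch. The Python below is not Standard Norway's or MarkLogic's implementation; the element names (standard, clause, xref), the identifiers and the URL pattern are all hypothetical, chosen only to show how XML content allows full-text search and hyperlinking both within a document and to other standards.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for one standard.
# This is an assumption for illustration, not Standard Norway's actual schema.
STANDARD_XML = """
<standard id="NS-0001">
  <title>Example standard</title>
  <clause id="c1"><p>Scope of the environmental requirements.</p></clause>
  <clause id="c2"><p>Testing is described in <xref target="c1"/> and in <xref target="NS-0002#c4"/>.</p></clause>
</standard>
"""


def search_clauses(root, term):
    """Return the ids of clauses whose body text contains the term (case-insensitive)."""
    term = term.lower()
    return [
        clause.get("id")
        for clause in root.iter("clause")
        if term in "".join(clause.itertext()).lower()
    ]


def xref_to_href(target):
    """Turn a cross-reference target into an HTML href.

    A target with '#' points into another standard; otherwise it is an
    anchor inside the same document. The URL pattern is made up.
    """
    if "#" in target:
        document_id, anchor = target.split("#", 1)
        return f"/standards/{document_id}#{anchor}"
    return f"#{target}"


root = ET.fromstring(STANDARD_XML)

# Full-text search over the content, not just the metadata.
hits = search_clauses(root, "environmental")

# Hyperlinks within the same document and to other standards.
links = [xref_to_href(x.get("target")) for x in root.iter("xref")]
```

In this sketch, a search for "environmental" finds clause c1 even though the word appears nowhere in the title, and the two cross-references become an in-document anchor and a link to another standard. In a production system the database's own indexes would be expected to do this work rather than a scan over every clause.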
New possibilities demand new skills
They found the NoSQL database by looking at a solution used by the International Organization for Standardization – better known as ISO – which is the global standard-setting organisation with 162 member countries. ISO had recently faced many of the same issues as Standard Norway, and they had chosen a supplier and a NoSQL database solution that they believed was capable of handling the task at hand: its name was MarkLogic.
"So we quickly turned to MarkLogic as well. Their database has excellent search functions, supports XML well, and in addition we can store all binary data in the system, which means that we can continue to store all PDF files and Word documents in the same database," says Vidar Ingebretsen. He explains that MarkLogic was responsible for training Standard Norway employees in the use of the MarkLogic database, and also had a fixed team of consultants attached to the project, so that Standard Norway always had access to superusers of the system. After several months of analysis, the project was initiated in January 2014 and as of April 2014 it was in full swing handling the migration of data from the old to the new system.
Was there anything that surprised you with regard to the project?
"A lot of scripting has to be done in the database, and much of it is functional programming, which is a different kind of programming to what I'm used to. So the learning curve was somewhat steeper than I would have thought in terms of using NoSQL. But we can also see that it will work out well, because we have a very close partnership with MarkLogic. Without that, it would have been a huge challenge. There's a great deal that is different to traditional SQL, and although the system offers many options, it requires a certain amount of training to take advantage of them," says Vidar Ingebretsen.
Good business case
Although the decision to switch from SQL to NoSQL was made before Vidar Ingebretsen joined Standard Norway, he is convinced it was a business-driven and not an IT-driven decision.
"ISO has set a market trend by switching to NoSQL, and many have followed in their footsteps. With ISO, we've seen better data quality, better search capabilities, e-book capabilities and better link options that give customers a much better user experience. The expectation is, of course, that this will ultimately lead to increased sales in the webshop," says Vidar Ingebretsen, adding:
"We also expect that the optimisation of internal business processes can lead to significant savings. Currently, much of the production of the standards is based on manual processes, which can now be automated with the new system. For example, this includes quality assurance of the PDF files, where in future we will use a system called Pitstop to ensure quality control of the data and rectify PDFs by adding fonts that do not exist in the file," explains Vidar Ingebretsen.
Although he generally describes the NoSQL project as a successful one, there have also been challenges along the way.
"We especially had problems with migrating data from the old system to the new and connecting the data in a good way. This is primarily because the data quality in the old system was worse than we had expected. The complexity of the solution has also proved to be a challenge. It requires excellent XML skills to embark on a project such as this. We had these skills via MarkLogic's consultants, as well as by upgrading and further developing the expertise of our own staff. That was a necessity, and you need to be aware of this if you're considering embarking on a similar project," concludes Vidar Ingebretsen.
Highlights of MarkLogic
- Supports ACID transactions, horizontal scaling, real-time indexing and disaster recovery
- Supports searches across different data types – text, images, date/time, geospatial data and currencies – from many different data sources
- Ability to run MarkLogic directly on the Hadoop Distributed File System (HDFS) and move data between MarkLogic and Hadoop within the applications
- Supports real-time analytics and business intelligence
- Good tools and APIs for rapid application development
About Standard Norway's online solution
Standard Norway's online solution is based on the e-commerce solution EPiServer Commerce with an advanced product database from MarkLogic as the foundation. The product database supplies both metadata and product content to EPiServer Commerce. In addition to single purchases of standards via www.standard.no, the company is also testing a subscription solution based on the security solution in MarkLogic.