Testing schematron-nokogiri GEM Part 2

This post demonstrates using Nokogiri::XML::Builder to create an XML document, and schematron-nokogiri to validate the document created. The validation schema is built following literate programming principles.

1 Install Test Environment

Create a Gemfile with the following content:

source "https://rubygems.org"
gem 'schematron-nokogiri'

and run

bundle install

2 Define Data Model

2.1 Define Logical Model

The logical model is a Ruby hash with entity names as keys. Each value is an array of column hashes with the attributes 'name', 'domain' and 'description'.

logicalModel = {

  User: [
    { 
      name: "name",
      domain: "name", 
      description: "name of user",
    },
    { 
      name: "lastname",
      domain: "lastname",
      description: "lastname of user",
    },
    { 
      name: "email",
      domain: "email",
      description: "email address of user",
    },
  ],
  
  Company: [
    { 
      name: "name",
      domain: "co_name", 
      description: "name of companry",
    },
    { 
      name: "email",
      domain: "email",
      description: "email address of user",
    },
  ]
}
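Because the model is a plain hash, nothing stops a column from leaving out one of the three attributes. A minimal sketch of a shape check follows; the helper missing_column_keys and the constant REQUIRED_COLUMN_KEYS are hypothetical names introduced here, not part of the original scripts.

```ruby
# Attributes every logical model column is expected to carry (from the text above).
REQUIRED_COLUMN_KEYS = %i[name domain description].freeze

# Hypothetical helper: returns one message per missing attribute.
def missing_column_keys(model)
  model.flat_map do |entity, cols|
    cols.each_with_index.flat_map do |col, i|
      (REQUIRED_COLUMN_KEYS - col.keys).map { |key| "#{entity}[#{i}]: missing #{key}" }
    end
  end
end

# Small sample in the same shape as logicalModel.
sample = {
  User: [
    { name: "name", domain: "name", description: "name of user" },
    { name: "email", domain: "email" }, # description left out on purpose
  ],
}

problems = missing_column_keys(sample)
puts problems
```

The Schematron rules in section 4.3 make the same kind of check on the generated XML, so this sketch only moves the feedback earlier.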

2.2 Define Physical Model

The physical model is a Ruby hash with table names as keys. Each value is an array of column hashes, and each column refers to a logical model attribute through 'entity' and 'entity_name'.

physicalModel = {

    Users: [
      {
        entity: "User",
        entity_name: "name",
        name: "user_name",
      },
      {
        entity: "Company",
        entity_name: "name",
        name: "Company_name",
      },
    ]
}
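The references from the physical model to the logical model can be resolved directly on the hashes. The sketch below assumes that 'entity' names a key of the logical model and 'entity_name' names a column of that entity; the helper dangling_references and the sample data are hypothetical.

```ruby
# Hypothetical helper: lists physical columns whose (entity, entity_name)
# pair does not match any column in the logical model.
def dangling_references(logical, physical)
  physical.flat_map do |table, cols|
    dangling = cols.reject do |col|
      entity_cols = logical[col[:entity].to_sym] || []
      entity_cols.any? { |lcol| lcol[:name] == col[:entity_name] }
    end
    dangling.map { |col| "#{table}.#{col[:name]} -> #{col[:entity]}.#{col[:entity_name]}" }
  end
end

logical = {
  User:    [{ name: "name", domain: "name", description: "name of user" }],
  Company: [{ name: "name", domain: "co_name", description: "name of company" }],
}
physical = {
  Users: [
    { entity: "User",    entity_name: "name", name: "user_name" },
    { entity: "Account", entity_name: "name", name: "account_name" }, # no such entity
  ],
}

dangling = dangling_references(logical, physical)
puts dangling
```

The Schematron rule in section 4.5 checks the entity part of this reference on the XML document.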

3 Create schema_package XML Using Nokogiri::XML::Builder

3.1 Column Element in schema_package

Lambdas to create XML elements for logical or physical model columns:

# Create 'xml' entry in namespace 'ns' for logical model column 'col'
#
# For example
#     <lm:col domain="name" description="name of user">
#       <!-- yield to create <name> element -->
#     </lm:col>

logicalModelColXml = -> ( ns, xml, col, &b ) do
  xml[ns].col( :domain=>col[:domain], :description=>col[:description] ) do
    # yield name
    b.call if b
  end
end

physicalModelColXml = -> ( ns, xml, col, &b ) do
  xml[ns].col( :entity=>col[:entity], :entity_name=>col[:entity_name] ) do
    # yield name
    b.call if b
  end
end
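Both lambdas declare a block parameter (&b), so a block given to call is handed to the lambda in the same way a block is handed to a method. A minimal sketch of that mechanism without Nokogiri, where the lambda wrap is a hypothetical stand-in that builds strings instead of XML nodes:

```ruby
# A lambda literal may declare a block parameter, like a method.
wrap = ->(tag, &b) do
  inner = b ? b.call : "" # the block is optional, as in the lambdas above
  "<#{tag}>#{inner}</#{tag}>"
end

with_block    = wrap.call("col") { "<name>email</name>" }
without_block = wrap.call("col")

puts with_block
puts without_block
```

This is why buildXmlTable in the next section can pass a block to colXml.call to create the nested name element.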

3.2 Hash Visitor in schema_package

Methods buildXmlTable and buildXml iterate over the data model hashes:

# Build 'xml' entry for 'tableName' in namespace 'ns' using 'colXml'
# to build xml for array 'tableCols'.
#
# For example 'User' in namespace 'lm':
#
# <lm:table>
#   <lm:name name="User">User</lm:name>
#   <lm:cols>
#      <!-- result of iterating 'colXml' -->
#   </lm:cols>
# </lm:table>

def buildXmlTable ns, xml, tableName, tableCols, colXml
  xml[ns].table do
    xml[ns].name tableName, :name=> tableName 
    xml[ns].cols do 
      tableCols.map do |col|
        # e.g.
        # xml[ns].col( :domain=>col[:domain], :description=>col[:description] ) do
        #   xml[ns].name col[:name]
        # end
        colXml.call(ns, xml, col)  do
          xml[ns].name col[:name]
        end
      end
    end
  end
end

# Build 'xml' in namespace 'ns' for table definitions in 'tables' with
# xml for columns created using lambda 'colXml'.
def buildXml ns, xml, tables, colXml
  tables.keys.map { |table| buildXmlTable( ns, xml, table, tables[table], colXml ) }
end
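The order of the generated elements follows the order of the hash, because Ruby hashes preserve insertion order. The sketch below mirrors the traversal of buildXml and buildXmlTable but collects plain strings instead of XML nodes; trace_tables is a hypothetical stand-in, not part of the original scripts.

```ruby
# Hypothetical stand-in for buildXml: records what would be visited,
# one line per table and one indented line per column.
def trace_tables(ns, tables)
  tables.flat_map do |table_name, cols|
    ["#{ns}:table #{table_name}"] + cols.map { |col| "  #{ns}:col #{col[:name]}" }
  end
end

model = {
  User:    [{ name: "name" }, { name: "email" }],
  Company: [{ name: "name" }],
}

trace = trace_tables("lm", model)
puts trace
```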

3.3 Main schema_package

The main script creates the schema_package for the logical and physical data models:

# Namespaces in the XML document
namespaces = {
  "xmlns:lm" => "http://www.example.com/logical_model",
  "xmlns:pm" => "http://www.example.com/physical_model",
}

# Create 'schema_package' XML for 'logicalModel', and 'physicalModel',
# with column elements defined by lambdas 'logicalModelColXml',
# 'physicalModelColXml' respectively.
xml = Nokogiri::XML::Builder.new { |xml| 
  xml.schema_package( namespaces ) do
    buildXml "lm",  xml, logicalModel, logicalModelColXml
    buildXml "pm",  xml, physicalModel, physicalModelColXml
  end
}.to_xml

3.4 The Composite schema_package Builder

Putting it all together,

require 'nokogiri'

logicalModel = {

  User: [
    { 
      name: "name",
      domain: "name", 
      description: "name of user",
    },
    { 
      name: "lastname",
      domain: "lastname",
      description: "lastname of user",
    },
    { 
      name: "email",
      domain: "email",
      description: "email address of user",
    },
  ],
  
  Company: [
    { 
      name: "name",
      domain: "co_name", 
      description: "name of companry",
    },
    { 
      name: "email",
      domain: "email",
      description: "email address of user",
    },
  ]
}
physicalModel = {

    Users: [
      {
        entity: "User",
        entity_name: "name",
        name: "user_name",
      },
      {
        entity: "Company",
        entity_name: "name",
        name: "Company_name",
      },
    ]
}
# Create 'xml' entry in namespace 'ns' for logical model column 'col'
#
# For example
#     <lm:col domain="name" description="name of user">
#       <!-- yield to create <name> element -->
#     </lm:col>

logicalModelColXml = -> ( ns, xml, col, &b ) do
  xml[ns].col( :domain=>col[:domain], :description=>col[:description] ) do
    # yield name
    b.call if b
  end
end

physicalModelColXml = -> ( ns, xml, col, &b ) do
  xml[ns].col( :entity=>col[:entity], :entity_name=>col[:entity_name] ) do
    # yield name
    b.call if b
  end
end

# Build 'xml' entry for 'tableName' in namespace 'ns' using 'colXml'
# to build xml for array 'tableCols'.
#
# For example 'User' in namespace 'lm':
#
# <lm:table>
#   <lm:name name="User">User</lm:name>
#   <lm:cols>
#      <!-- result of iterating 'colXml' -->
#   </lm:cols>
# </lm:table>

def buildXmlTable ns, xml, tableName, tableCols, colXml
  xml[ns].table do
    xml[ns].name tableName, :name=> tableName 
    xml[ns].cols do 
      tableCols.map do |col|
        # e.g.
        # xml[ns].col( :domain=>col[:domain], :description=>col[:description] ) do
        #   xml[ns].name col[:name]
        # end
        colXml.call(ns, xml, col)  do
          xml[ns].name col[:name]
        end
      end
    end
  end
end

# Build 'xml' in namespace 'ns' for table definitions in 'tables' with
# xml for columns created using lambda 'colXml'.
def buildXml ns, xml, tables, colXml
  tables.keys.map { |table| buildXmlTable( ns, xml, table, tables[table], colXml ) }
end


# Namespaces in the XML document
namespaces = {
  "xmlns:lm" => "http://www.example.com/logical_model",
  "xmlns:pm" => "http://www.example.com/physical_model",
}

# Create 'schema_package' XML for 'logicalModel', and 'physicalModel',
# with column elements defined by lambdas 'logicalModelColXml',
# 'physicalModelColXml' respectively.
xml = Nokogiri::XML::Builder.new { |xml| 
  xml.schema_package( namespaces ) do
    buildXml "lm",  xml, logicalModel, logicalModelColXml
    buildXml "pm",  xml, physicalModel, physicalModelColXml
  end
}.to_xml

Running the script leaves the following XML document in the variable xml:

<?xml version="1.0"?>
<schema_package xmlns:lm="http://www.example.com/logical_model" xmlns:pm="http://www.example.com/physical_model">
  <lm:table>
    <lm:name name="User">User</lm:name>
    <lm:cols>
      <lm:col domain="name" description="name of user">
        <lm:name>name</lm:name>
      </lm:col>
      <lm:col domain="lastname" description="lastname of user">
        <lm:name>lastname</lm:name>
      </lm:col>
      <lm:col domain="email" description="email address of user">
        <lm:name>email</lm:name>
      </lm:col>
    </lm:cols>
  </lm:table>
  <lm:table>
    <lm:name name="Company">Company</lm:name>
    <lm:cols>
      <lm:col domain="co_name" description="name of companry">
        <lm:name>name</lm:name>
      </lm:col>
      <lm:col domain="email" description="email address of user">
        <lm:name>email</lm:name>
      </lm:col>
    </lm:cols>
  </lm:table>
  <pm:table>
    <pm:name name="Users">Users</pm:name>
    <pm:cols>
      <pm:col entity="User" entity_name="name">
        <pm:name>user_name</pm:name>
      </pm:col>
      <pm:col entity="Company" entity_name="name">
        <pm:name>Company_name</pm:name>
      </pm:col>
    </pm:cols>
  </pm:table>
</schema_package>
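Section 5 reads the document from the file datamodels.xml, while the builder script only keeps it in the variable xml. A minimal sketch of the missing step follows. It uses a placeholder string and a temporary directory so that it runs on its own; in the real script a single call File.write("datamodels.xml", xml) in the working directory would be the assumed equivalent.

```ruby
require 'tmpdir'

# Placeholder for the string returned by to_xml in the builder script.
xml = %(<?xml version="1.0"?>\n<schema_package/>\n)

# Write the document and read it back; the temporary directory is
# removed when the block ends.
saved = Dir.mktmpdir do |dir|
  path = File.join(dir, "datamodels.xml")
  File.write(path, xml)
  File.read(path)
end

puts saved == xml
```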

4 Schematron Schema to Validate schema_package

4.1 Logical Model Top Validation

Logical model rules

<sch:rule context="schema_package">
  <sch:assert test="count(lm:table)">Schema package MUST define at least one 'lm:table'</sch:assert>
</sch:rule>

4.2 Logical Model Table Validation

Validate Logical Model table

<sch:rule context="lm:table">
  <sch:assert test="count(.//lm:col) > 0">Logical model 'lm:table' MUST define at least one 'lm:col'</sch:assert>
  <sch:assert test="count(lm:name[text()]) = 1">Logical model 'lm:table' MUST define exactly one non-empty 'lm:name'</sch:assert>
</sch:rule>

4.3 Logical Model Column Validation

Validate Logical Model columns

<sch:rule context="lm:col">
  <sch:assert test="@domain">Logical model 'lm:col' MUST define @domain attribute</sch:assert>
  <sch:assert test="@description">Logical model 'lm:col' MUST define @description attribute</sch:assert>
</sch:rule>

4.4 Physical Model Top Validation

<sch:rule context="schema_package">
  <sch:assert test="count(pm:table)">Physical model MUST define at least one 'pm:table'</sch:assert>
</sch:rule>

4.5 Physical Model Table Column Validation

Rules to validate physical model table column elements

<sch:rule context="pm:table//pm:col">
  <sch:assert test="@entity">Physical model 'pm:col' MUST define @entity attribute</sch:assert>
  <sch:assert test="count(//lm:table/lm:name[@name=current()/@entity ]) = 1">Physical model 'pm:col' @entity MUST reference an existing logical model table</sch:assert>
  <sch:assert test="@entity_name">Physical model 'pm:col' MUST define @entity_name attribute</sch:assert>
</sch:rule>

4.6 The Composite Schematron schema_package Validator

The composite Schematron schema to validate schema_package XML documents

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:ns uri="http://www.example.com/logical_model" prefix="lm"/>
  <sch:ns uri="http://www.example.com/physical_model" prefix="pm"/>
  <sch:pattern name="Logical model">
    <sch:rule context="schema_package">
      <sch:assert test="count(lm:table)">Schema package MUST define at least one 'lm:table'</sch:assert>
    </sch:rule>
    <sch:rule context="lm:table">
      <sch:assert test="count(.//lm:col) > 0">Logical model 'lm:table' MUST define at least one 'lm:col'</sch:assert>
      <sch:assert test="count(lm:name[text()]) = 1">Logical model 'lm:table' MUST define exactly one non-empty 'lm:name'</sch:assert>
    </sch:rule>
    <sch:rule context="lm:col">
      <sch:assert test="@domain">Logical model 'lm:col' MUST define @domain attribute</sch:assert>
      <sch:assert test="@description">Logical model 'lm:col' MUST define @description attribute</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:pattern name="Physical model">
    <sch:rule context="schema_package">
      <sch:assert test="count(pm:table)">Physical model MUST define at least one 'pm:table'</sch:assert>
    </sch:rule>
    <sch:rule context="pm:table//pm:col">
      <sch:assert test="@entity">Physical model 'pm:col' MUST define @entity attribute</sch:assert>
      <sch:assert test="count(//lm:table/lm:name[@name=current()/@entity ]) = 1">Physical model 'pm:col' @entity MUST reference an existing logical model table</sch:assert>
      <sch:assert test="@entity_name">Physical model 'pm:col' MUST define @entity_name attribute</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>

5 Validate schema_package

Ruby to validate the schema_package in file datamodels.xml using the Schematron schema in file schema-1.stron:

require 'schematron-nokogiri'

# load the Schematron schema
stron_doc = Nokogiri::XML File.open "schema-1.stron"

# make it a SchematronNokogiri object
stron = SchematronNokogiri::Schema.new stron_doc

# load an XML document
doc_name = "datamodels.xml"
xml_doc = Nokogiri::XML File.open doc_name

# validate the XML document - output errors (or reports)
stron.validate(xml_doc).each do |error| 
  puts "#{doc_name}[#{error[:rule_type]}@#{error[:line]}]: #{error[:message]}"
end
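When the validation runs as part of a build, it helps to turn the result into an exit status. The sketch below works on a hand-written array shaped like the hashes printed in the loop above (keys :rule_type, :line and :message). The sample values, the helper format_error and the choice of exit status 1 are assumptions for illustration, not output of the gem.

```ruby
# Hypothetical sample; in the real script this would be stron.validate(xml_doc).
errors = [
  { rule_type: "assert", line: 12, message: "Logical model 'lm:col' MUST define @domain attribute" },
  { rule_type: "assert", line: 30, message: "Physical model 'pm:col' MUST define @entity attribute" },
]

# Same message layout as the puts in the validation loop.
def format_error(doc_name, error)
  "#{doc_name}[#{error[:rule_type]}@#{error[:line]}]: #{error[:message]}"
end

lines = errors.map { |error| format_error("datamodels.xml", error) }
warn lines.join("\n") unless lines.empty?

# Non-zero status when any assertion failed; a real script would call exit(status).
status = errors.empty? ? 0 : 1
puts "exit status would be #{status}"
```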

6 Version information

Tested using GEM versions

Gems included by the bundle:
  * bundler (1.15.4)
  * mini_portile2 (2.3.0)
  * nokogiri (1.8.2)
  * schematron-nokogiri (0.0.3)

on Ruby version

ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]

7 Fin

This is the second part of a blog series covering the Schematron XML validation language. See the previous blog post for the first part.
