iBench
What we need to do to use iBench
- Define how the output of iBench maps onto the components of our integration system (source schema, target schema, ontology, mappings, data source).
- Decide whether to use a user-defined primitive (UDP), which lets us start from an existing integration system (or a part of one) that iBench then extends. Examples exist for a DBLP source schema and for a small system.
- By default, iBench adds new relations to the source schema. If we use a UDP, it would be good to find a way to keep iBench from modifying the source schema too much, so that we can keep using a dedicated data generator for the source schema of the UDP. For example, we could extend the BSBM integration system with iBench while still using the BSBM data generator.
- Because iBench also changes the target schema, we have to think about (i) how to adapt the existing queries of the benchmark if we use a UDP, and (ii) otherwise, how to create queries on the generated target schema, because iBench does not provide
Configurations
The default and possible configurations are:
################################################################################
# IBENCH CONFIGURATION FILE #
################################################################################
# This is a Java properties file, i.e., options are represented as key-value
# pairs separated by "=". Comment lines start with a "hash" character (like
# this line).
################################################################################

################################################################################
# Output path prefixes and file names
################################################################################
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles
# relative path for storing data files
InstancePathPrefix=templateOut
# Source and target schema xml file name
# FileNames.SourceSchema = sourceSchema.xml
# FileNames.TargetSchema = targetSchema.xml
# Name for the data generators configuration file
# FileNames.SourceInstance=dataGen.tsl
# Name for the XML file storing source instance data
# FileNames.SourceDocumentName=sourceInst
# Name of the metadata XML file storing schemas, constraints, and mappings
FileNames.Schemas=metadata.xml

################################################################################
# Number of instances for each primitive type. A primitive is a mapping template
# consisting of source and target schema elements, mappings between these schema
# fragments, and other metadata.
################################################################################
# Copies a source relation to the target
Scenarios.COPY = 0
# Horizontally partition a source relation into multiple target relations
Scenarios.HORIZPARTITION = 0
# Copy a relation and create a surrogate key
Scenarios.SURROGATEKEY = 0
# Inverse of vertical partitioning, join multiple vertical fragments from the source to create a target relation
Scenarios.MERGING = 0
# Inverse of horizontal partitioning, i.e., a union of multiple source fragments
Scenarios.FUSION = 0
# Copy a source relation that has a foreign key to itself (e.g., employee with a foreign key mapping each employee to their manager)
Scenarios.SELFJOINS = 0
# Vertically partition a source relation
Scenarios.VERTPARTITION = 0
# Copies a source relation to the target and adds one or more new attributes without correspondence to the source
Scenarios.ADDATTRIBUTE = 0
# Copies a source relation to the target removing one or more new attributes
Scenarios.DELATTRIBUTE = 0
# Combination of ADDATTRIBUTE and DELATTRIBUTE
Scenarios.ADDDELATTRIBUTE = 0
# Vertically partitions a source relation that corresponds to an IsA relationship
Scenarios.VERTPARTITIONISA = 0
# Vertically partitions a source relation that corresponds to an HasA relationship
Scenarios.VERTPARTITIONHASA = 0
# Vertically partitions a source relation that corresponds to an N-to-M relationship
Scenarios.VERTPARTITIONNTOM = 0
# Inverse of vertical partitioning combined wth ADDATTRIBUTE
Scenarios.MERGEADD = 0
# Vertically partitions a source relation that corresponds to several independent star-shaped relationships
Scenarios.VERTPARTITIONISAAUTHORITY = 0

################################################################################
# ConfigOptions, configuration options that control the metadata and data
# generation. ConfigOptions.X sets the mean and ConfigOptionsDeviation.X
# the variance of a standard distribution. Whenever an option has to be applied
# a new value is sampled from this normal distribution.
################################################################################
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 5
# Number of attributes added by primitives that delete attributes
ConfigOptions.NumOfNewAttributes = 1
# Number of attributes deleted by primitives that delete source attributes
ConfigOptions.NumOfAttributesToDelete = 1
# Number of relations joined by a mapping (this is also used for other subdivisions by mappings that do not join, e.g., number of horizontal partitions)
ConfigOptions.JoinSize = 2
# Number of parameters to functions
ConfigOptions.NumOfParamsInFunctions = 1
# Number of attributes in the primary key of a relation
ConfigOptions.PrimaryKeySize = 1
# Number of attributes from each input compared by join conditions, e.g., for 2 the condition may be A=C AND B=D
ConfigOptions.NumOfJoinAttributes = 2
# Star is 0, chain is 1, random is 2
ConfigOptions.JoinKind = 0
########################################
# Controls the amount of sharing of
# relations across primitives
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing source relations part of already created primitives
#ConfigOptions.ReuseSourcePerc = 0
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing target relations part of already created primitives
#ConfigOptions.ReuseTargetPerc = 0
# Percentage of primitives that are created without any reuse, for the remaining primitives ReuseSourcePerc is applied
#ConfigOptions.NoReuseScenPerc = 100
########################################
#
# Determines how parameters of skolem functions are chosen:
#ConfigOptions.SkolemKind = 1
#
#ConfigOptions.SourceSkolemPerc = 0
#
#ConfigOptions.SourceFDPerc = 0
########################################
# Generate random inclusion dependencies
#ConfigOptions.SourceInclusionDependencyPerc = 0
#ConfigOptions.SourceInclusionDependencyFKPerc = 100
#ConfigOptions.TargetInclusionDependencyPerc = 0
#ConfigOptions.TargetInclusionDependencyFKPerc = 100
# exists is 1 and not exists is 0
#ConfigOptions.SourceCircularInclusionDependency = 0
#ConfigOptions.SourceCircularFK = 0
#ConfigOptions.TargetCircularInclusionDependency = 0
#ConfigOptions.TargetCircularFK = 0
########################################
# Complexity of VP authority primitives
# ConfigOptions.VPAuthorityComplexity = 2
########################################
# Controls whether source data types are propagated to the target based on correspondences
# activate 1 / deactivate 0
# ConfigOptions.PropagateDTsToTarget = 0
# chance (%) of a propagated DT to deviate from the source DT
# ConfigOptions.PropagateDTsChanceOfDeviation = 0
# how much a deviated datatype differs from the original one (max 100)
# ConfigOptions.PropagateDTsDegreeOfDeviation = 10
########################################
# Variance for each of the above parameters
#ConfigOptionsDeviation.NumOfSubElements = 0
#ConfigOptionsDeviation.NumOfNewAttributes = 0
#ConfigOptionsDeviation.NumOfAttributesToDelete = 0
#ConfigOptionsDeviation.JoinSize = 0
#ConfigOptionsDeviation.NumOfParamsInFunctions = 0
#ConfigOptionsDeviation.PrimaryKeySize = 0
#ConfigOptionsDeviation.NumOfJoinAttributes = 0
#ConfigOptionsDeviation.JoinKind = 0
#ConfigOptionsDeviation.ReuseSourcePerc = 0
#ConfigOptionsDeviation.ReuseTargetPerc = 0
#ConfigOptionsDeviation.NoReuseScenPerc = 0
#ConfigOptionsDeviation.SkolemKind = 0
#ConfigOptionsDeviation.SourceSkolemPerc = 0
#ConfigOptionsDeviation.SourceFDPerc = 0
#ConfigOptionsDeviation.SourceInclusionDependencyPerc = 0
#ConfigOptionsDeviation.SourceInclusionDependencyFKPerc = 0
#ConfigOptionsDeviation.TargetInclusionDependencyPerc = 0
#ConfigOptionsDeviation.TargetInclusionDependencyFKPerc = 0
#ConfigOptionsDeviation.SourceCircularInclusionDependency = 0
#ConfigOptionsDeviation.SourceCircularFK = 0
#ConfigOptionsDeviation.TargetCircularInclusionDependency = 0
#ConfigOptionsDeviation.TargetCircularFK = 0
#ConfigOptionsDeviation.PropagateDTsToTarget = 0
#ConfigOptionsDeviation.PropagateDTsChanceOfDeviation = 0
# how much a deviated datatype differs from the original one (max 100)
# ConfigOptions.PropagateDTsDegreeOfDeviation = 10

################################################################################
# User defined primitives (UDP) specification.
################################################################################
# The number of user defined primitives to be loaded
#LoadScenarios.NumScenarios = 1
########################################
# For each user defined primitive, three parameters prefixed with LoadScenarios.i for the ith UDP
# starting from 0. See exampleScenarios/fh.xml for an example of a UDP
########################################
# the TrampXML format file storing the schema, correspondences, and mappings defining the UDP
#LoadScenarios.0.File = exampleScenarios/fh.xml
# A user-defined name for the UDP
#LoadScenarios.0.Name = simpleTest
# how many instances should be created
#LoadScenarios.0.Inst = 10

################################################################################
# User defined data types, i.e., value generator
#
# The user needs to specify the number of such data types.
# For each data type specify its name (any name would do), the fully classified
# class implementing the value generator (a subclass of ToxGene's ToXgeneCdataGenerator),
# and the chance of the value generator being picked for any given attribute in the
# generated source schema
################################################################################
# number of user-defined datatypes to be used
#DataType.NumDataType = 1
########################################
# For each user defined data type (UDT), four parameters need to be provided.
# A user-defined ToXgeneCdataGenerator needs to implement interface ToXgeneCdataGenerator and
# has to have a constructor without parameters.
########################################
# name for the datatype, choosen by user
#DataType.0.Name = myemail
# class implementing a value generator for this datatype
#DataType.0.ClassPath = toxgene.util.cdata.xmark.Emails
# the percentage of attributes that should use this DT (this is interpreted as a probability)
#DataType.0.Percentage = 60.0
# when loading generated data into a database, use this SQL datatype for the column of this UDT
#DataType.0.DBType = TEXT
# if the datatype implementation is provided as a separate jar file, the path to this jar file
#DataType.0.JarFile = exampleDataTypes/mydt.jar
#DataType.1.Name = mybirds
#DataType.1.ClassPath = toxgene.util.cdata.xmark.Birds
#DataType.1.Percentage = 20.0
#DataType.1.DBType = INT8

################################################################################
# CSV Data Types
#
# iBench also supports UDTs which are based on data given a CSV file. The
# user specifies a list of CSV files. The values in each column of a file
# in this list can be turned in to UDT. iBench reads the bag of values from
# the column and interprets it as a probability distribution.
#
# For instance, if the values in a column are ( 0, 1, 1, 2, 2 ) then values
# for the corresponding data type are chosen as follows:
# 0 => with probability 0.2
# 1 => with probability 0.4
# 2 => with probability 0.4
#
################################################################################
# number of CSV files to load
# CSVDataType.NumFiles = 1
# the path to the ith CSV file
# CSVDataType.0.File = zip_codes_states.csv
# number of datatypes to define based on the CSV files
# CSVDataType.NumDataType = 2
########################################
# CSV file this UDT is coming from
# CSVDataType.0.File = zip_codes_states.csv
# Name of the attribute from which the values are read
# CSVDataType.0.AttrName = state
# the percentage of attributes that should use this DT (this is interpreted as a probability)
# CSVDataType.0.Percentage = 20.0
# when loading generated data into a database, use this SQL datatype for the column of this UDT
# CSVDataType.0.DBType = TEXT

################################################################################
# Various additional options
# Random number generator and max values, DataGenerator and MappingLang
################################################################################
# Seed for the random number generator, used for repeatability
RandomSeed = 2
# Number of tuples per relation, if data is generated
RepElementCount = 100
# Maximum length of strings created by the data generator
MaxStringLength = 100
# Maximum numerical value created by the data generator
MaxNumValue = 1000
# Type of data generator (currently only TrampCSV supported)
DataGenerator = TrampCSV
# Type of query generator to use (Postgres, Perm)
QueryGenerator = Postgres
# Mapping language used (currently FO tgds or SO tgds are supported)
# MappingLanguage = FOtgds
# Specify a renamer, a class that renames all generated attribute values (None, AllLowerCase)
# AttrRenamer = None
# Number of independent tuples (not created by data exchange) generated for each target relation
# TargetTableNumRows = 50
# Generate target data as source data exchanged by the generated mappings
# ExchangeTargetData = true

################################################################################
# Optional activation/deactivation of output options
################################################################################
# Generate HTML schema
OutputOption.HTMLSchemas = false
# Generate source data
OutputOption.Data = false
# Generate target data that is independent of the source data
OutputOption.EnableTargetData = false
# Generate XMLSchema schemas for the source and target schemas
OutputOption.XMLSchemas = false
# Generate an HTML description of the source to target mappings
OutputOption.HTMLMapping = false
# Generate TrampXML file, an XML based metadata format storing the generated schemas, mappings, constraints, etc.
OutputOption.TrampXML = true
# Generate a Clio conformant mapping file
OutputOption.Clio = false

################################################################################
# Optional activation/deactivation of parts of the generated Tramp XML document
################################################################################
# Generate correspondences aka schema matches
TrampXMLOutput.Correspondences = true
# Generate transformations implementing the mappings (currently only SQL)
TrampXMLOutput.Transformations = true
# Generate data
TrampXMLOutput.Data = false
# Generate a connection info (allows Tramp tools to connect to a database, e.g., to load a schema)
TrampXMLOutput.ConnectionInfo = false
# Generate functional dependencies
TrampXMLOutput.FDs = false
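Since the configuration is a plain key-value file, it can be handy to check which primitives a given file actually activates before running iBench. The following is a minimal sketch, not part of iBench: `parse_properties` and `active_primitives` are hypothetical helper names, and the parser only handles the simple `key = value` and `#` comment forms used in the file above (not the full Java properties syntax).

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines; lines starting with '#' are comments."""
    options: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a key-value pair; ignored in this sketch
        options[key.strip()] = value.strip()
    return options


def active_primitives(options: Dict[str, str]) -> Dict[str, int]:
    """Return the primitives requested at least once (Scenarios.X > 0)."""
    prefix = "Scenarios."
    return {
        key[len(prefix):]: int(value)
        for key, value in options.items()
        if key.startswith(prefix) and int(value) > 0
    }


# Small excerpt in the style of the configuration file above.
SAMPLE = """
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/one-copy
# Copies a source relation to the target
Scenarios.COPY = 1
Scenarios.HORIZPARTITION = 0
"""

options = parse_properties(SAMPLE)
print(active_primitives(options))  # {'COPY': 1}
```

Commented-out options (such as `#ConfigOptions.SkolemKind = 1`) are skipped by this sketch, so only values that are really set show up.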
One copy
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/one-copy
# Copies a source relation to the target
Scenarios.COPY = 1
iBench - Mapping Generation Time: 0.076 seconds
iBench - Stats Computation Time: 0.0 seconds
Source Schema Stats: 1 (relations) 5 (attributes)
Target Schema Stats: 1 (relations) 5 (attributes)
Total Schema Stats: 2 (relations) 10 (attributes)
Total Mappings Stats: 1 (mappings)
Number of Attributes
There are two parameters to set the number of attributes in relations:
- ConfigOptions.NumOfSubElements sets the mean number of attributes per relation.
- ConfigOptionsDeviation.NumOfSubElements sets the deviation around that mean.
I think that parameters for the number of attributes are lacking. We need separate parameters for the source and the target schema, as well as parameters that set the minimum and the maximum number of attributes per relation.
With Deviation
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/nb-attributes
# Copies a source relation to the target
Scenarios.COPY = 100
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 2
# Deviation of attributes per generated relation
ConfigOptionsDeviation.NumOfSubElements = 1
iBench - Mapping Generation Time: 0.31 seconds
iBench - Stats Computation Time: 0.001 seconds
Source Schema Stats: 100 (relations) 216 (attributes)
Target Schema Stats: 100 (relations) 216 (attributes)
Total Schema Stats: 200 (relations) 432 (attributes)
Total Mappings Stats: 100 (mappings)
The resulting system metadata contains mostly relations with 2 or 3 attributes, never 1, and occasionally relations with 4 attributes, such as this target relation:
<Relation name="start_cp_38_nl0_ce0copy38_38">
  <Attr> <Name>fruit_cp_38_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
  <Attr> <Name>science_cp_38_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
  <Attr> <Name>desire_cp_38_nl0_ae2</Name> <DataType>TEXT</DataType> </Attr>
  <Attr> <Name>last_cp_38_nl0_ae3</Name> <DataType>TEXT</DataType> </Attr>
  <PrimaryKey> <Attr>fruit_cp_38_nl0_ae0ke0</Attr> </PrimaryKey>
</Relation>
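To check such observations over a whole metadata file rather than by reading the XML, one could count the attributes of every relation. The sketch below is only an illustration: `attribute_histogram` is a hypothetical helper, and the embedded sample is a shortened, made-up document that follows the `<Relation>`/`<Attr>`/`<PrimaryKey>` layout shown above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, shortened sample in the shape of the generated metadata.
SAMPLE = """
<Schemas>
  <SourceSchema>
    <Relation name="r_source">
      <Attr><Name>a0</Name><DataType>TEXT</DataType></Attr>
      <Attr><Name>a1</Name><DataType>TEXT</DataType></Attr>
      <PrimaryKey><Attr>a0</Attr></PrimaryKey>
    </Relation>
  </SourceSchema>
  <TargetSchema>
    <Relation name="r_target">
      <Attr><Name>a0</Name><DataType>TEXT</DataType></Attr>
      <Attr><Name>a1</Name><DataType>TEXT</DataType></Attr>
      <Attr><Name>a2</Name><DataType>TEXT</DataType></Attr>
      <PrimaryKey><Attr>a0</Attr></PrimaryKey>
    </Relation>
  </TargetSchema>
</Schemas>
"""


def attribute_histogram(root: ET.Element) -> Counter:
    """Map 'number of attributes' to 'number of relations with that many'."""
    histogram: Counter = Counter()
    for relation in root.iter("Relation"):
        # findall("Attr") only matches direct children, so the <Attr>
        # nested inside <PrimaryKey> is not counted a second time.
        histogram[len(relation.findall("Attr"))] += 1
    return histogram


histogram = attribute_histogram(ET.fromstring(SAMPLE))
for size in sorted(histogram):
    print(f"{histogram[size]} relation(s) with {size} attribute(s)")
```

For a generated file, the root could be obtained with `ET.parse(path).getroot()` instead of `ET.fromstring`, where `path` points to the metadata XML file.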
Code investigation
The parameter ConfigOptions.NumOfSubElements is used to set numOfElements in the AbstractScenarioGenerator class; the same holds for the deviation. These values are used by the CopyScenarioGenerator class, where a value A is set using the method getRandomNumberAroundSomething, which returns a random value according to numOfElements and its deviation. Moreover, CopyScenarioGenerator forces A to be at least 2, which matches the relations with exactly 2 attributes observed in the output. A is then used as the number of attributes in both the source and the target relation.
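The behaviour described here can be approximated as "draw a value around the mean, then clamp it from below". The following sketch is not the iBench code: `sample_num_attributes` is a hypothetical function, the normal draw is an assumption taken from the comments of the configuration file, and the lower bound of 2 is an assumption consistent with the relations of exactly 2 attributes seen in the generated metadata.

```python
import random


def sample_num_attributes(
    mean: int, deviation: int, rng: random.Random, minimum: int = 2
) -> int:
    """Draw a number of attributes around `mean`, never below `minimum`.

    Assumption: the value is drawn from a normal distribution and rounded;
    the real getRandomNumberAroundSomething may behave differently.
    """
    drawn = round(rng.gauss(mean, deviation))
    return max(minimum, drawn)


rng = random.Random(2)  # fixed seed for repeatability
samples = [sample_num_attributes(2, 1, rng) for _ in range(100)]
print(sorted(set(samples)))  # distinct sizes drawn; never below 2
```

Under these assumptions, a configured mean of 1 with no deviation would still yield relations with 2 attributes, because the lower bound takes precedence over the configured value.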
Horizontal partition into multiple relations
A horizontal partition repeats the same mapping twice, once per target fragment, with each copy selecting half of the data in its query.
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/horizontal-partition
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 1
# Horizontally partition a source relation into multiple target relations
Scenarios.HORIZPARTITION = 1
We include the resulting system description below. We notice the following:
- The setting ConfigOptions.NumOfSubElements = 1 seems to be ignored, since each relation has two or three attributes.
- The mapping definitions contain no constants; the selection is only expressed in the transformations, i.e. the SQL queries.
<Mappings>
  <Mapping id="M0">
    <Uses>
      <Correspondence ref="C0"/>
      <Correspondence ref="C1"/>
    </Uses>
    <Foreach>
      <Atom tableref="test_hp_0_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> </Atom>
    </Foreach>
    <Exists>
      <Atom tableref="test_hp_0_nl0_ce0__HP0FR0_from_0_to_4999"> <Var>b</Var> <Var>c</Var> </Atom>
    </Exists>
  </Mapping>
  <Mapping id="M1">
    <Uses>
      <Correspondence ref="C2"/>
      <Correspondence ref="C3"/>
    </Uses>
    <Foreach>
      <Atom tableref="test_hp_0_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> </Atom>
    </Foreach>
    <Exists>
      <Atom tableref="test_hp_0_nl0_ce0__HP0FR1_from_5000_to_9999"> <Var>b</Var> <Var>c</Var> </Atom>
    </Exists>
  </Mapping>
</Mappings>
<Transformations>
  <Transformation id="T0" creates="test_hp_0_nl0_ce0__HP0FR0_from_0_to_4999">
    <Implements>
      <Mapping ref="M0"/>
    </Implements>
    <Code>SELECT x.nut_hp_0_nl0_ae0 AS nut_hp_0_nl0_ae0, x.art_hp_0_nl0_ae1 AS art_hp_0_nl0_ae1 FROM source.test_hp_0_nl0_ce0 AS x WHERE 0<=x.selectorhp0 AND x.selectorhp0<=4999</Code>
  </Transformation>
  <Transformation id="T1" creates="test_hp_0_nl0_ce0__HP0FR1_from_5000_to_9999">
    <Implements>
      <Mapping ref="M1"/>
    </Implements>
    <Code>SELECT x.nut_hp_0_nl0_ae0 AS nut_hp_0_nl0_ae0, x.art_hp_0_nl0_ae1 AS art_hp_0_nl0_ae1 FROM source.test_hp_0_nl0_ce0 AS x WHERE 5000<=x.selectorhp0 AND x.selectorhp0<=9999</Code>
  </Transformation>
</Transformations>
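The two WHERE predicates in the transformations split the values of the selector attribute into two contiguous ranges. As an illustration of that splitting, the sketch below recomputes the ranges. It is not taken from iBench: `fragment_ranges` and `where_clause` are hypothetical helpers, and the domain size of 10000 is inferred from the fragment names (`from_0_to_4999`, `from_5000_to_9999`).

```python
from typing import List, Tuple


def fragment_ranges(domain_size: int, fragments: int) -> List[Tuple[int, int]]:
    """Split [0, domain_size - 1] into `fragments` contiguous ranges."""
    width = domain_size // fragments
    ranges = []
    for index in range(fragments):
        low = index * width
        # The last fragment absorbs any remainder of the division.
        if index == fragments - 1:
            high = domain_size - 1
        else:
            high = (index + 1) * width - 1
        ranges.append((low, high))
    return ranges


def where_clause(selector: str, low: int, high: int) -> str:
    """Build a range predicate in the style of the generated SQL."""
    return f"{low}<=x.{selector} AND x.{selector}<={high}"


for low, high in fragment_ranges(10000, 2):
    print(where_clause("selectorhp0", low, high))
# 0<=x.selectorhp0 AND x.selectorhp0<=4999
# 5000<=x.selectorhp0 AND x.selectorhp0<=9999
```

This also makes the observation above concrete: the constants 4999 and 5000 exist only in the SQL predicates, while the Foreach/Exists parts of the two mappings carry no condition on the selector.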
Vertical partitioning into an IS-A relationship
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/is-a-partition
# Vertically partitions a source relation that corresponds to an IsA relationship
Scenarios.VERTPARTITIONISA = 1
<Mappings>
  <Mapping id="M0">
    <Uses>
      <Correspondence ref="C0"/>
      <Correspondence ref="C1"/>
      <Correspondence ref="C2"/>
      <Correspondence ref="C3"/>
      <Correspondence ref="C4"/>
      <Correspondence ref="C5"/>
    </Uses>
    <Foreach>
      <Atom tableref="test_vi_0_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> <Var>d</Var> <Var>e</Var> </Atom>
    </Foreach>
    <Exists>
      <Atom tableref="society_vi_0_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> </Atom>
      <Atom tableref="compare_vi_0_nl0_ce1"> <Var>a</Var> <Var>d</Var> <Var>e</Var> </Atom>
    </Exists>
  </Mapping>
</Mappings>
<Transformations>
  <Transformation id="T0" creates="society_vi_0_nl0_ce0">
    <Implements>
      <Mapping ref="M0"/>
    </Implements>
    <Code>SELECT x.nut_vi_0_nl0_ae0ke0 AS nut_vi_0_nl0_ae0ke0joinattr, x.slope_vi_0_nl0_ae1 AS slope_vi_0_nl0_ae1, x.measure_vi_0_nl0_ae2 AS measure_vi_0_nl0_ae2 FROM source.test_vi_0_nl0_ce0 AS x</Code>
  </Transformation>
  <Transformation id="T1" creates="compare_vi_0_nl0_ce1">
    <Implements>
      <Mapping ref="M0"/>
    </Implements>
    <Code>SELECT x.nut_vi_0_nl0_ae0ke0 AS nut_vi_0_nl0_ae0ke0ref, x.touch_vi_0_nl0_ae3 AS touch_vi_0_nl0_ae3, x.cheese_vi_0_nl0_ae4 AS cheese_vi_0_nl0_ae4 FROM source.test_vi_0_nl0_ce0 AS x</Code>
  </Transformation>
</Transformations>
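Read together, the mapping and its two transformations split each source tuple into two target tuples that share the key: one keeps the key under a `joinattr` name, the other under a `ref` name. The sketch below replays that split on one row. It is an illustration only: `vertical_partition_isa` is a hypothetical helper and the attribute names are shortened versions of the generated ones.

```python
from typing import Dict, Sequence, Tuple

Row = Dict[str, str]


def vertical_partition_isa(
    row: Row, key: str, parent_attrs: Sequence[str], child_attrs: Sequence[str]
) -> Tuple[Row, Row]:
    """Split one source row into a parent and a child row sharing the key."""
    # The parent fragment exposes the key as '<key>joinattr', as in T0.
    parent = {f"{key}joinattr": row[key]}
    parent.update({name: row[name] for name in parent_attrs})
    # The child fragment references the same key as '<key>ref', as in T1.
    child = {f"{key}ref": row[key]}
    child.update({name: row[name] for name in child_attrs})
    return parent, child


source_row = {"nut": "k1", "slope": "s", "measure": "m", "touch": "t", "cheese": "c"}
parent, child = vertical_partition_isa(
    source_row, "nut", ["slope", "measure"], ["touch", "cheese"]
)
print(parent)  # {'nutjoinattr': 'k1', 'slope': 's', 'measure': 'm'}
print(child)   # {'nutref': 'k1', 'touch': 't', 'cheese': 'c'}
```

Joining the two fragments on `nutjoinattr = nutref` gives back the original row, which is what the shared variable `a` in the two Exists atoms expresses.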
Skolem Type
In our RDF integration system, an existential variable must represent an unknown value that is different for each combination of values taken by the non-existential variables. This is why we need to set ConfigOptions.SkolemKind = 1 (1 presumably meaning All).
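The requirement can be illustrated with a small Skolem-term builder. This is a sketch under assumptions, not the iBench implementation: `skolem_term` is a hypothetical helper, and the `name|arg1|arg2` string format mirrors the `'sk0' || '|' || ...` pattern that appears in the generated SQL of the ADDDELATTRIBUTE example on this page.

```python
def skolem_term(function_name: str, *arguments: str) -> str:
    """Build a Skolem term as 'name|arg1|arg2|...'."""
    return "|".join((function_name, *arguments))


# With all non-existential (universally quantified) variables as arguments,
# two source tuples that differ in any value get different unknown values.
first = skolem_term("sk0", "key1", "valueA")
second = skolem_term("sk0", "key1", "valueB")
print(first)            # sk0|key1|valueA
print(first != second)  # True

# With only the key as argument, both tuples share the same unknown value.
print(skolem_term("sk0", "key1") == skolem_term("sk0", "key1"))  # True
```

Which argument set iBench uses for each value of SkolemKind is not confirmed here; the sketch only shows why passing all non-existential variables gives the behaviour we need.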
TODO Delete Attribute Primitive
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/primitive-del
########################################
# Controls the amount of sharing of
# relations across primitives
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing source relations part of already created primitives
ConfigOptions.ReuseSourcePerc = 100
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing target relations part of already created primitives
ConfigOptions.ReuseTargetPerc = 100
# Percentage of primitives that are created without any reuse, for the remaining primitives ReuseSourcePerc is applied
ConfigOptions.NoReuseScenPerc = 0
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 4
# Number of attributes deleted by primitives that delete source attributes
ConfigOptions.NumOfAttributesToDelete = 2
# Copies a source relation to the target removing one or more new attributes
Scenarios.DELATTRIBUTE = 2
<xm:MappingScenario xmlns:xm="org/vagabond/xmlmodel">
  <Schemas>
    <SourceSchema>
      <Relation name="test_dl_1_nl0_ce0">
        <Attr> <Name>nut_dl_1_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>slope_dl_1_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>measure_dl_1_nl0_ae2</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>touch_dl_1_nl0_ae3</Name> <DataType>TEXT</DataType> </Attr>
        <PrimaryKey> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </PrimaryKey>
      </Relation>
    </SourceSchema>
    <TargetSchema>
      <Relation name="cheese_dl_1_nl0_ce0">
        <Attr> <Name>nut_dl_1_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>slope_dl_1_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
        <PrimaryKey> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </PrimaryKey>
      </Relation>
      <Relation name="society_dl_2_nl0_ce0">
        <Attr> <Name>nut_dl_1_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>slope_dl_1_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
        <PrimaryKey> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </PrimaryKey>
      </Relation>
    </TargetSchema>
  </Schemas>
  <Correspondences>
    <Correspondence id="C0">
      <From tableref="test_dl_1_nl0_ce0"> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </From>
      <To tableref="cheese_dl_1_nl0_ce0"> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </To>
    </Correspondence>
    <Correspondence id="C1">
      <From tableref="test_dl_1_nl0_ce0"> <Attr>slope_dl_1_nl0_ae1</Attr> </From>
      <To tableref="cheese_dl_1_nl0_ce0"> <Attr>slope_dl_1_nl0_ae1</Attr> </To>
    </Correspondence>
    <Correspondence id="C2">
      <From tableref="test_dl_1_nl0_ce0"> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </From>
      <To tableref="society_dl_2_nl0_ce0"> <Attr>nut_dl_1_nl0_ae0ke0</Attr> </To>
    </Correspondence>
    <Correspondence id="C3">
      <From tableref="test_dl_1_nl0_ce0"> <Attr>slope_dl_1_nl0_ae1</Attr> </From>
      <To tableref="society_dl_2_nl0_ce0"> <Attr>slope_dl_1_nl0_ae1</Attr> </To>
    </Correspondence>
  </Correspondences>
  <Mappings>
    <Mapping id="M0">
      <Uses>
        <Correspondence ref="C0"/>
        <Correspondence ref="C1"/>
      </Uses>
      <Foreach>
        <Atom tableref="test_dl_1_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> <Var>d</Var> </Atom>
      </Foreach>
      <Exists>
        <Atom tableref="cheese_dl_1_nl0_ce0"> <Var>a</Var> <Var>b</Var> </Atom>
      </Exists>
    </Mapping>
    <Mapping id="M1">
      <Uses>
        <Correspondence ref="C2"/>
        <Correspondence ref="C3"/>
      </Uses>
      <Foreach>
        <Atom tableref="test_dl_1_nl0_ce0"> <Var>a</Var> <Var>b</Var> <Var>c</Var> <Var>d</Var> </Atom>
      </Foreach>
      <Exists>
        <Atom tableref="society_dl_2_nl0_ce0"> <Var>a</Var> <Var>b</Var> </Atom>
      </Exists>
    </Mapping>
  </Mappings>
  <Transformations>
    <Transformation id="T0" creates="cheese_dl_1_nl0_ce0">
      <Implements>
        <Mapping ref="M0"/>
      </Implements>
      <Code>SELECT x.nut_dl_1_nl0_ae0ke0 AS nut_dl_1_nl0_ae0ke0, x.slope_dl_1_nl0_ae1 AS slope_dl_1_nl0_ae1 FROM source.test_dl_1_nl0_ce0 AS x</Code>
    </Transformation>
    <Transformation id="T1" creates="society_dl_2_nl0_ce0">
      <Implements>
        <Mapping ref="M1"/>
      </Implements>
      <Code>SELECT x.nut_dl_1_nl0_ae0ke0 AS nut_dl_1_nl0_ae0ke0, x.slope_dl_1_nl0_ae1 AS slope_dl_1_nl0_ae1 FROM source.test_dl_1_nl0_ce0 AS x</Code>
    </Transformation>
  </Transformations>
</xm:MappingScenario>
TODO Add Delete Attribute Primitive
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/primitive-adl
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 2
# Combination of ADDATTRIBUTE and DELATTRIBUTE
Scenarios.ADDDELATTRIBUTE = 1
<xm:MappingScenario xmlns:xm="org/vagabond/xmlmodel">
  <Schemas>
    <SourceSchema>
      <Relation name="test_adl_0_nl0_ce0">
        <Attr> <Name>nut_adl_0_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>slope_adl_0_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
        <PrimaryKey> <Attr>nut_adl_0_nl0_ae0ke0</Attr> </PrimaryKey>
      </Relation>
    </SourceSchema>
    <TargetSchema>
      <Relation name="measure_adl_0_nl0_ce0">
        <Attr> <Name>nut_adl_0_nl0_ae0ke0</Name> <DataType>TEXT</DataType> </Attr>
        <Attr> <Name>touch_adl_0_nl0_ae1</Name> <DataType>TEXT</DataType> </Attr>
        <PrimaryKey> <Attr>nut_adl_0_nl0_ae0ke0</Attr> </PrimaryKey>
      </Relation>
    </TargetSchema>
  </Schemas>
  <Correspondences>
    <Correspondence id="C0">
      <From tableref="test_adl_0_nl0_ce0"> <Attr>nut_adl_0_nl0_ae0ke0</Attr> </From>
      <To tableref="measure_adl_0_nl0_ce0"> <Attr>nut_adl_0_nl0_ae0ke0</Attr> </To>
    </Correspondence>
  </Correspondences>
  <Mappings>
    <Mapping id="M0">
      <Uses>
        <Correspondence ref="C0"/>
      </Uses>
      <Foreach>
        <Atom tableref="test_adl_0_nl0_ce0"> <Var>a</Var> <Var>b</Var> </Atom>
      </Foreach>
      <Exists>
        <Atom tableref="measure_adl_0_nl0_ce0"> <Var>a</Var> <Var>c</Var> </Atom>
      </Exists>
    </Mapping>
  </Mappings>
  <Transformations>
    <Transformation id="T0" creates="measure_adl_0_nl0_ce0">
      <Implements>
        <Mapping ref="M0"/>
      </Implements>
      <Code>SELECT x.nut_adl_0_nl0_ae0ke0 AS nut_adl_0_nl0_ae0ke0, 'sk0' || '|' || x.nut_adl_0_nl0_ae0ke0 AS touch_adl_0_nl0_ae1 FROM source.test_adl_0_nl0_ce0 AS x</Code>
    </Transformation>
  </Transformations>
</xm:MappingScenario>
UDP Source Relation
# relative path for storing metadata related files
SchemaPathPrefix=schemaFiles/udp-source-rel
################################################################################
# User defined primitives (UDP) specification.
################################################################################
# The number of user defined primitives to be loaded
LoadScenarios.NumScenarios = 1
########################################
# For each user defined primitive, three parameters prefixed with LoadScenarios.i for the ith UDP
# starting from 0. See exampleScenarios/fh.xml for an example of a UDP
########################################
# the TrampXML format file storing the schema, correspondences, and mappings defining the UDP
LoadScenarios.0.File = udp/dblp-article.xml
# A user-defined name for the UDP
LoadScenarios.0.Name = simpleTest
# how many instances should be created
LoadScenarios.0.Inst = 1
########################################
# Controls the amount of sharing of
# relations across primitives
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing source relations part of already created primitives
ConfigOptions.ReuseSourcePerc = 99
# After NoReuseScenPerc, each primitive has ReuseSourcePerc probability of reusing target relations part of already created primitives
ConfigOptions.ReuseTargetPerc = 100
# Percentage of primitives that are created without any reuse, for the remaining primitives ReuseSourcePerc is applied
ConfigOptions.NoReuseScenPerc = 0
# Number of attributes per generated relation
ConfigOptions.NumOfSubElements = 10
#ConfigOptionsDeviation.NumOfSubElements = 100
Scenarios.VERTPARTITIONISA = 1
<xm:MappingScenario xmlns:xm="org/vagabond/xmlmodel">
  <Schemas>
    <SourceSchema>
      <Relation name="test_vi_1_nl0_ce0">
        <Attr><Name>nut_vi_1_nl0_ae0ke0</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>slope_vi_1_nl0_ae1</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>measure_vi_1_nl0_ae2</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>touch_vi_1_nl0_ae3</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>cheese_vi_1_nl0_ae4</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>society_vi_1_nl0_ae5</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>compare_vi_1_nl0_ae6</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>branch_vi_1_nl0_ae7</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>demand_vi_1_nl0_ae8</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>great_vi_1_nl0_ae9</Name><DataType>TEXT</DataType></Attr>
        <PrimaryKey>
          <Attr>nut_vi_1_nl0_ae0ke0</Attr>
        </PrimaryKey>
      </Relation>
    </SourceSchema>
    <TargetSchema>
      <Relation name="board_vi_1_nl0_ce0">
        <Attr><Name>nut_vi_1_nl0_ae0ke0joinattr</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>slope_vi_1_nl0_ae1</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>measure_vi_1_nl0_ae2</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>touch_vi_1_nl0_ae3</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>cheese_vi_1_nl0_ae4</Name><DataType>TEXT</DataType></Attr>
        <PrimaryKey>
          <Attr>nut_vi_1_nl0_ae0ke0joinattr</Attr>
        </PrimaryKey>
      </Relation>
      <Relation name="affect_vi_1_nl0_ce1">
        <Attr><Name>nut_vi_1_nl0_ae0ke0ref</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>society_vi_1_nl0_ae5</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>compare_vi_1_nl0_ae6</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>branch_vi_1_nl0_ae7</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>demand_vi_1_nl0_ae8</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>great_vi_1_nl0_ae9</Name><DataType>TEXT</DataType></Attr>
      </Relation>
      <Relation name="different_le_1_nl0_ce0_Triple" xmlns:this="org/vagabond/xmlmodel" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <Attr><Name>different_le_1_nl0_ce0_subject</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>different_le_1_nl0_ce0_property</Name><DataType>TEXT</DataType></Attr>
        <Attr><Name>different_le_1_nl0_ce0_object</Name><DataType>TEXT</DataType></Attr>
        <PrimaryKey>
          <Attr>different_le_1_nl0_ce0_subject</Attr>
        </PrimaryKey>
      </Relation>
      <ForeignKey>
        <From tableref="affect_vi_1_nl0_ce1"><Attr>nut_vi_1_nl0_ae0ke0ref</Attr></From>
        <To tableref="board_vi_1_nl0_ce0"><Attr>nut_vi_1_nl0_ae0ke0joinattr</Attr></To>
      </ForeignKey>
    </TargetSchema>
  </Schemas>
  <Correspondences>
    <Correspondence id="C0">
      <From tableref="test_vi_1_nl0_ce0"><Attr>nut_vi_1_nl0_ae0ke0</Attr></From>
      <To tableref="board_vi_1_nl0_ce0"><Attr>nut_vi_1_nl0_ae0ke0joinattr</Attr></To>
    </Correspondence>
    <Correspondence id="C1">
      <From tableref="test_vi_1_nl0_ce0"><Attr>slope_vi_1_nl0_ae1</Attr></From>
      <To tableref="board_vi_1_nl0_ce0"><Attr>slope_vi_1_nl0_ae1</Attr></To>
    </Correspondence>
    <Correspondence id="C2">
      <From tableref="test_vi_1_nl0_ce0"><Attr>measure_vi_1_nl0_ae2</Attr></From>
      <To tableref="board_vi_1_nl0_ce0"><Attr>measure_vi_1_nl0_ae2</Attr></To>
    </Correspondence>
    <Correspondence id="C3">
      <From tableref="test_vi_1_nl0_ce0"><Attr>touch_vi_1_nl0_ae3</Attr></From>
      <To tableref="board_vi_1_nl0_ce0"><Attr>touch_vi_1_nl0_ae3</Attr></To>
    </Correspondence>
    <Correspondence id="C4">
      <From tableref="test_vi_1_nl0_ce0"><Attr>cheese_vi_1_nl0_ae4</Attr></From>
      <To tableref="board_vi_1_nl0_ce0"><Attr>cheese_vi_1_nl0_ae4</Attr></To>
    </Correspondence>
    <Correspondence id="C5">
      <From tableref="test_vi_1_nl0_ce0"><Attr>nut_vi_1_nl0_ae0ke0</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>nut_vi_1_nl0_ae0ke0ref</Attr></To>
    </Correspondence>
    <Correspondence id="C6">
      <From tableref="test_vi_1_nl0_ce0"><Attr>society_vi_1_nl0_ae5</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>society_vi_1_nl0_ae5</Attr></To>
    </Correspondence>
    <Correspondence id="C7">
      <From tableref="test_vi_1_nl0_ce0"><Attr>compare_vi_1_nl0_ae6</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>compare_vi_1_nl0_ae6</Attr></To>
    </Correspondence>
    <Correspondence id="C8">
      <From tableref="test_vi_1_nl0_ce0"><Attr>branch_vi_1_nl0_ae7</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>branch_vi_1_nl0_ae7</Attr></To>
    </Correspondence>
    <Correspondence id="C9">
      <From tableref="test_vi_1_nl0_ce0"><Attr>demand_vi_1_nl0_ae8</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>demand_vi_1_nl0_ae8</Attr></To>
    </Correspondence>
    <Correspondence id="C10">
      <From tableref="test_vi_1_nl0_ce0"><Attr>great_vi_1_nl0_ae9</Attr></From>
      <To tableref="affect_vi_1_nl0_ce1"><Attr>great_vi_1_nl0_ae9</Attr></To>
    </Correspondence>
    <Correspondence id="different_le_1_nl0_ce0_c1" xmlns:this="org/vagabond/xmlmodel" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <From tableref="different_le_1_nl0_ce0_DArticle"><Attr>different_le_1_nl0_ce0_pid</Attr></From>
      <To tableref="different_le_1_nl0_ce0_Triple"><Attr>different_le_1_nl0_ce0_subject</Attr></To>
    </Correspondence>
    <Correspondence id="different_le_1_nl0_ce0_c2" xmlns:this="org/vagabond/xmlmodel" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <From tableref="different_le_1_nl0_ce0_DArticle"><Attr>different_le_1_nl0_ce0_title</Attr></From>
      <To tableref="different_le_1_nl0_ce0_Triple"><Attr>different_le_1_nl0_ce0_property</Attr></To>
    </Correspondence>
    <Correspondence id="different_le_1_nl0_ce0_c3" xmlns:this="org/vagabond/xmlmodel" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <From tableref="different_le_1_nl0_ce0_DArticle"><Attr>different_le_1_nl0_ce0_pages</Attr></From>
      <To tableref="different_le_1_nl0_ce0_Triple"><Attr>different_le_1_nl0_ce0_object</Attr></To>
    </Correspondence>
  </Correspondences>
  <Mappings>
    <Mapping id="M0">
      <Uses>
        <Correspondence ref="C0"/>
        <Correspondence ref="C1"/>
        <Correspondence ref="C2"/>
        <Correspondence ref="C3"/>
        <Correspondence ref="C4"/>
        <Correspondence ref="C5"/>
        <Correspondence ref="C6"/>
        <Correspondence ref="C7"/>
        <Correspondence ref="C8"/>
        <Correspondence ref="C9"/>
        <Correspondence ref="C10"/>
      </Uses>
      <Foreach>
        <Atom tableref="test_vi_1_nl0_ce0">
          <Var>a</Var> <Var>b</Var> <Var>c</Var> <Var>d</Var> <Var>e</Var> <Var>f</Var> <Var>g</Var> <Var>h</Var> <Var>i</Var> <Var>j</Var>
        </Atom>
      </Foreach>
      <Exists>
        <Atom tableref="board_vi_1_nl0_ce0">
          <Var>a</Var> <Var>b</Var> <Var>c</Var> <Var>d</Var> <Var>e</Var>
        </Atom>
        <Atom tableref="affect_vi_1_nl0_ce1">
          <Var>a</Var> <Var>f</Var> <Var>g</Var> <Var>h</Var> <Var>i</Var> <Var>j</Var>
        </Atom>
      </Exists>
    </Mapping>
  </Mappings>
  <Transformations>
    <Transformation id="T0" creates="board_vi_1_nl0_ce0">
      <Implements>
        <Mapping ref="M0"/>
      </Implements>
      <Code>SELECT x.nut_vi_1_nl0_ae0ke0 AS nut_vi_1_nl0_ae0ke0joinattr, x.slope_vi_1_nl0_ae1 AS slope_vi_1_nl0_ae1, x.measure_vi_1_nl0_ae2 AS measure_vi_1_nl0_ae2, x.touch_vi_1_nl0_ae3 AS touch_vi_1_nl0_ae3, x.cheese_vi_1_nl0_ae4 AS cheese_vi_1_nl0_ae4 FROM source.test_vi_1_nl0_ce0 AS x</Code>
    </Transformation>
    <Transformation id="T1" creates="affect_vi_1_nl0_ce1">
      <Implements>
        <Mapping ref="M0"/>
      </Implements>
      <Code>SELECT x.nut_vi_1_nl0_ae0ke0 AS nut_vi_1_nl0_ae0ke0ref, x.society_vi_1_nl0_ae5 AS society_vi_1_nl0_ae5, x.compare_vi_1_nl0_ae6 AS compare_vi_1_nl0_ae6, x.branch_vi_1_nl0_ae7 AS branch_vi_1_nl0_ae7, x.demand_vi_1_nl0_ae8 AS demand_vi_1_nl0_ae8, x.great_vi_1_nl0_ae9 AS great_vi_1_nl0_ae9 FROM source.test_vi_1_nl0_ce0 AS x</Code>
    </Transformation>
  </Transformations>
</xm:MappingScenario>
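To translate this output into our own integration system, we first need to read the schemas, mappings, and SQL transformations back out of the metadata file. The following is a minimal sketch in Python, assuming only the element names visible in the example above (Schemas, Correspondences, Mappings, Transformations). The names `relations`, `summarize`, and `SAMPLE` are hypothetical helpers for illustration and are not part of iBench. The sketch also lists every relation that a correspondence references but that neither schema declares. On the example above, that check should report `different_le_1_nl0_ce0_DArticle`, which appears in three correspondences but not in the SourceSchema.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written scenario (hypothetical) that mimics the layout of the example above.
SAMPLE = """
<xm:MappingScenario xmlns:xm="org/vagabond/xmlmodel">
  <Schemas>
    <SourceSchema>
      <Relation name="src"><Attr><Name>k</Name><DataType>TEXT</DataType></Attr></Relation>
    </SourceSchema>
    <TargetSchema>
      <Relation name="tgt"><Attr><Name>k</Name><DataType>TEXT</DataType></Attr></Relation>
    </TargetSchema>
  </Schemas>
  <Correspondences>
    <Correspondence id="C0">
      <From tableref="src"><Attr>k</Attr></From><To tableref="tgt"><Attr>k</Attr></To>
    </Correspondence>
    <Correspondence id="C1">
      <From tableref="ghost"><Attr>k</Attr></From><To tableref="tgt"><Attr>k</Attr></To>
    </Correspondence>
  </Correspondences>
  <Mappings>
    <Mapping id="M0">
      <Foreach><Atom tableref="src"><Var>a</Var></Atom></Foreach>
      <Exists><Atom tableref="tgt"><Var>a</Var></Atom></Exists>
    </Mapping>
  </Mappings>
  <Transformations>
    <Transformation id="T0" creates="tgt">
      <Code>SELECT x.k AS k FROM source.src AS x</Code>
    </Transformation>
  </Transformations>
</xm:MappingScenario>
"""


def relations(schema):
    """Map each relation name in a schema element to its attribute names."""
    return {
        rel.get("name"): [attr.findtext("Name") for attr in rel.findall("Attr")]
        for rel in schema.findall("Relation")
    }


def summarize(xml_text):
    """Extract schemas, mappings, SQL code, and undeclared table references."""
    root = ET.fromstring(xml_text.strip())
    # Only the root element carries the xm: prefix; its children are unqualified.
    source = relations(root.find("Schemas/SourceSchema"))
    target = relations(root.find("Schemas/TargetSchema"))
    mappings = {
        m.get("id"): {
            "foreach": [a.get("tableref") for a in m.findall("Foreach/Atom")],
            "exists": [a.get("tableref") for a in m.findall("Exists/Atom")],
        }
        for m in root.findall("Mappings/Mapping")
    }
    sql = {
        t.get("creates"): t.findtext("Code").strip()
        for t in root.findall("Transformations/Transformation")
    }
    # Collect correspondence endpoints whose relation is declared in neither schema.
    dangling = set()
    for corr in root.findall("Correspondences/Correspondence"):
        src_ref = corr.find("From").get("tableref")
        tgt_ref = corr.find("To").get("tableref")
        if src_ref not in source:
            dangling.add(src_ref)
        if tgt_ref not in target:
            dangling.add(tgt_ref)
    return {"source": source, "target": target, "mappings": mappings,
            "sql": sql, "dangling": sorted(dangling)}


if __name__ == "__main__":
    info = summarize(SAMPLE)
    for key, value in info.items():
        print(key, value)
```

To apply it to a real run, read the generated metadata file (metadata.xml with the default FileNames.Schemas setting) and pass its content to `summarize`. The `sql` entries give one SELECT statement per target relation, which could serve as a starting point for writing queries over the generated target schema.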