Printer Services Mail Archive: PS> FW: [relax-ng-comment] In

PS> FW: [relax-ng-comment] InstanceToSchema 1.0

From: McDonald, Ira (imcdonald@sharplabs.com)
Date: Sun Feb 16 2003 - 16:38:09 EST

    Hi,

    Folks now prototyping PSI might want to play with the tool below.

    Cheers,
    - Ira McDonald
      High North Inc

    -----Original Message-----
    From: Didier DEMANY [mailto:didier.demany@xmloperator.net]
    Sent: Thursday, February 13, 2003 3:08 PM
    To: relax-ng-comment@lists.oasis-open.org
    Subject: [relax-ng-comment] InstanceToSchema 1.0

    Hi,

    I am pleased to announce InstanceToSchema 1.0 [1]

    InstanceToSchema is a RELAX NG schema generator from XML instances.

    It is a command-line tool written in Java. To run, it needs J2SE 1.3 or 1.4
    and a JAXP-compliant SAX parser.
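
    For readers who have not used JAXP, here is a minimal sketch of how a Java
    program obtains whatever SAX parser is installed and runs it over a
    document. It is only an illustration and is not taken from
    InstanceToSchema; the class name SaxParserDemo and its method are
    hypothetical, and the code uses language features newer than J2SE 1.4.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: not part of InstanceToSchema.
public class SaxParserDemo {

    /** Counts the start-element events a JAXP-provided SAX parser reports for the given XML. */
    public static int countElements(String xml) {
        final int[] count = {0};
        try {
            // JAXP locates a SAX parser implementation; the caller never names one.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        count[0]++;
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>")); // prints 3
    }
}
```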

    InstanceToSchema is developed within the xmloperator project [2] and shares
    its BSD-style license, but it is packaged separately and can be used
    independently of the XML editor.

    The software is based on pattern categories. A pattern category represents
    a set of RELAX NG patterns. The tool's job is to build, for each element
    name, a pattern category that is compatible with all the input XML
    instances and is as precise as possible.

    The following pattern category types are implemented:

     * An EmptyPatternCategory represents content with no child elements. There
    may be only attributes and/or text.

     * An (OptionalRepeatable)ElementPatternCategory represents content with
    one element, or with several elements that all have the same name. There
    may also be attributes and/or text.

     * A GroupPatternCategory represents ordered content or a choice between
    ordered contents. There may also be attributes and/or text.

     * An InterleavePatternCategory represents unordered content. Some element
    names may appear several times, others may not. There may also be
    attributes and/or text.
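
    To make the four categories concrete, here is a deliberately simplified
    sketch that picks one of them from the child-element name sequences
    observed for a single element name. It is not InstanceToSchema's
    algorithm: the class CategorySketch, the enum, and the ordering heuristic
    (GROUP unless two names are seen in both relative orders) are assumptions
    made for illustration, and attributes, text and choices are ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: not the tool's real classes or algorithm.
public class CategorySketch {

    public enum Category { EMPTY, ELEMENT, GROUP, INTERLEAVE }

    /** Convenience helper to write one observed sequence of child names. */
    public static List<String> seq(String... names) {
        return Arrays.asList(names);
    }

    /**
     * Chooses a category from the child-name sequences seen for one element
     * name (one list per occurrence of that element in the input).
     */
    public static Category classify(List<List<String>> observed) {
        Set<String> names = new HashSet<>();
        for (List<String> s : observed) {
            names.addAll(s);
        }
        if (names.isEmpty()) {
            return Category.EMPTY;     // no child element was ever seen
        }
        if (names.size() == 1) {
            return Category.ELEMENT;   // a single child name, possibly absent or repeated
        }
        // Record every "x occurs before y" pair; seeing both (x, y) and (y, x)
        // means the content is not consistently ordered.
        Set<String> before = new HashSet<>();
        for (List<String> s : observed) {
            for (int i = 0; i < s.size(); i++) {
                for (int j = i + 1; j < s.size(); j++) {
                    String x = s.get(i);
                    String y = s.get(j);
                    if (x.equals(y)) {
                        continue;
                    }
                    if (before.contains(y + "\u0000" + x)) {
                        return Category.INTERLEAVE;
                    }
                    before.add(x + "\u0000" + y);
                }
            }
        }
        return Category.GROUP;
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(seq("head", "body"), seq("head"))));
        System.out.println(classify(Arrays.asList(seq("a", "b"), seq("b", "a"))));
    }
}
```

    With these inputs, the first call yields GROUP (head always precedes body)
    and the second yields INTERLEAVE (a and b occur in both orders).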

    All these pattern categories treat elements and attributes as
    independent. However, the tool framework doesn't require that: new pattern
    categories could correlate elements and attributes. Another thing the tool
    does not do is infer datatypes.

    The tool is suitable for processing large documents. It uses only one SAX
    parsing pass. The required memory space depends on the element name count
    and the complexity of patterns, not the document size.
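
    The single-pass idea can be sketched as follows. This is an assumed,
    much-reduced stand-in for the real tool: the hypothetical class
    OnePassSummary keeps only one entry per element name (the set of child
    names seen under it) plus a stack of currently open elements, so its
    memory grows with the number of distinct names and the nesting depth
    rather than with the document size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: not InstanceToSchema's implementation.
public class OnePassSummary extends DefaultHandler {

    // element name -> names of child elements seen directly under it
    private final Map<String, Set<String>> children = new TreeMap<>();
    // names of the elements that are currently open, innermost first
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        children.computeIfAbsent(qName, k -> new TreeSet<>());
        String parent = open.peek();          // null at the document root
        if (parent != null) {
            children.get(parent).add(qName);
        }
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    /** Parses the XML once and returns the per-element-name summary. */
    public static Map<String, Set<String>> summarize(String xml) {
        OnePassSummary handler = new OnePassSummary();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
        return handler.children;
    }

    public static void main(String[] args) {
        System.out.println(summarize("<list><item/><item/><note/></list>"));
    }
}
```

    However many item elements the input repeats, the summary still holds a
    single entry for item.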

    The set of pairs (element name, pattern category) is translated to a RELAX
    NG simple syntax data model (the same one used by the XML editor), which is
    converted to a more readable full syntax and written out with indentation.

    A typical use case is to obtain a description of the structure of one
    or several (combined) XML files. From my point of view, such a schema is
    not suitable for validating documents or for guiding their editing.

    I hope that this tool will be useful, or will encourage some developer to
    do better. I would welcome any comments.

    Regards,

    Didier Demany
    didier.demany@xmloperator.net
    The_xmloperator_project

    [1] http://www.xmloperator.net/i2s/

    [2] http://www.xmloperator.net/

    This archive was generated by hypermail 2b29 : Sun Feb 16 2003 - 16:38:48 EST