Working with XML Using Dom4j

            Dom4j

 

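Dom4j also makes it easy to create XML documents, modify elements, and query and traverse documents. It is slightly more complex than JDOM, but for large documents dom4j is said to perform better than JDOM.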

Parsing XML

  One of the first things you'll probably want to do is parse an XML document of some kind. This is easy to do in dom4j. The following code demonstrates how to do this.

import java.net.URL;

import org.dom4j.Document;

import org.dom4j.DocumentException;

import org.dom4j.io.SAXReader;

public class Foo {

    public Document parse(URL url) throws DocumentException {

        SAXReader reader = new SAXReader();

        Document document = reader.read(url);

        return document;

    }

}

 

# Preparation

Using Iterators

  A document can be navigated using a variety of methods that return standard Java Iterators. For example:

public void bar(Document document) throws DocumentException {

    Element root = document.getRootElement();

    // iterate through child elements of root

    for (Iterator<Element> it = root.elementIterator(); it.hasNext();) {

        Element element = it.next();

        // do something

    }

    // iterate through child elements of root with element name "foo"

    for (Iterator<Element> it = root.elementIterator("foo"); it.hasNext();) {

        Element foo = it.next();

        // do something

    }

    // iterate through attributes of root

    for (Iterator<Attribute> it = root.attributeIterator(); it.hasNext();) {

        Attribute attribute = it.next();

        // do something

    }

 }
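The explicit Iterator loops above can also be written as enhanced for loops. The following is only a sketch: it assumes dom4j 2.x on the classpath (where elements() and attributes() return typed lists), and the class name ForEachSketch and its describe method are made up for illustration.

```java
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical helper, not part of dom4j.
class ForEachSketch {

    // Lists the names of the root's child elements, then its attributes as name=value.
    static String describe(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            Element root = document.getRootElement();
            StringBuilder out = new StringBuilder();

            // iterate through child elements of root
            for (Element child : root.elements()) {
                out.append(child.getName()).append(';');
            }

            // iterate through attributes of root
            for (Attribute attribute : root.attributes()) {
                out.append(attribute.getName())
                   .append('=')
                   .append(attribute.getValue())
                   .append(';');
            }
            return out.toString();
        } catch (DocumentException e) {
            // parseText reports XML that is not well-formed with a checked exception
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
    }
}
```

For a small input such as a root element with one attribute and two children, describe returns the child names followed by the attribute pair, each terminated by a semicolon.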

 

 

First, provide the required jar files

Powerful Navigation with XPath

  In dom4j, XPath expressions can be evaluated on the Document or on any Node in the tree (such as Attribute, Element or ProcessingInstruction). This allows complex navigation throughout the document with a single line of code. For example:

public void bar(Document document) {

    List<Node> list = document.selectNodes("//foo/bar");

    Node node = document.selectSingleNode("//foo/bar/author");

    String name = node.valueOf("@name");

}

 

 

For example, if you wish to find all the hypertext links in an XHTML document, the following code would do the trick.

 

public void findLinks(Document document) throws DocumentException { 

    List<Node> list = document.selectNodes("//a/@href");

    for (Iterator<Node> iter = list.iterator(); iter.hasNext();) {

        Attribute attribute = (Attribute) iter.next();

        String url = attribute.getValue();

    }

}

 

 

If you need any help learning the XPath language, we highly recommend the Zvon tutorial, which allows you to learn by example.

 

Dom4j jar download:

Fast Looping

  If you ever have to walk a large XML document tree, then for performance we recommend you use the fast looping method, which avoids the cost of creating an Iterator object for each loop. For example:

public void treeWalk(Document document) {

    treeWalk(document.getRootElement());

}

public void treeWalk(Element element) {

    for (int i = 0, size = element.nodeCount(); i < size; i++) {

        Node node = element.node(i);

        if (node instanceof Element) {

            treeWalk((Element) node);

        }

        else {

            // do something…

        }

    }

}

 

http://sourceforge.net/projects/dom4j/files/dom4j-2.0.0-ALPHA-2/

Creating a new XML document

  Often in dom4j you will need to create a new document from scratch. Here's an example of doing that.

import org.dom4j.Document;

import org.dom4j.DocumentHelper;

import org.dom4j.Element;

public class Foo {

     public Document createDocument() {

        Document document = DocumentHelper.createDocument();

        Element root = document.addElement("root");

         Element author1 = root.addElement("author")

            .addAttribute("name", "James")

            .addAttribute("location", "UK")

            .addText("James Strachan");



        Element author2 = root.addElement("author")

            .addAttribute("name", "Bob")

            .addAttribute("location", "US")

            .addText("Bob McWhirter");



        return document;

    }

}

 

 

 

jaxen jar download:

Writing a document to a file

  A quick and easy way to write a Document (or any Node) to a Writer is via the write() method.

FileWriter out = new FileWriter("foo.xml");

document.write(out);

out.close();

 

  If you want to be able to change the format of the output, such as pretty printing or a compact format, or you want to be able to work with Writer objects or OutputStream objects as the destination, then you can use the XMLWriter class.

 

import java.io.FileWriter;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class Foo {

    public void write(Document document) throws IOException {
        // let's write to a file
        try (FileWriter fileWriter = new FileWriter("output.xml")) {
            XMLWriter writer = new XMLWriter(fileWriter);
            writer.write(document);
            writer.close();
        }

        // Pretty print the document to System.out
        OutputFormat format = OutputFormat.createPrettyPrint();
        XMLWriter writer = new XMLWriter(System.out, format);
        writer.write(document);
        writer.flush();

        // Compact format to System.out
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter(System.out, format);
        writer.write(document);
        // flush rather than close, so that System.out stays usable
        writer.flush();
    }
}

 

 

http://repo1.maven.org/maven2/jaxen/jaxen/1.1.1/jaxen-1.1.1.jar

Converting to and from Strings

  If you have a reference to a Document or any other Node such as an Attribute or Element, you can turn it into the default XML text via the asXML() method.

 

 

Document document = …;

String text = document.asXML();

 

  If you have some XML as a String, you can parse it back into a Document again using the helper method DocumentHelper.parseText().

 

String text = "<person> <name>James</name> </person>";

Document document = DocumentHelper.parseText(text);
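The two conversions can be combined into a round trip. This is a minimal sketch assuming dom4j is on the classpath; RoundTripSketch and nameAfterRoundTrip are illustrative names, not dom4j API.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Hypothetical helper, not part of dom4j.
class RoundTripSketch {

    // Parses the text, serializes it with asXML(), parses the result again,
    // and returns the text of the root's <name> child.
    static String nameAfterRoundTrip(String text) {
        try {
            Document first = DocumentHelper.parseText(text);
            String serialized = first.asXML();
            Document second = DocumentHelper.parseText(serialized);
            return second.getRootElement().elementText("name");
        } catch (DocumentException e) {
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
    }
}
```

Calling nameAfterRoundTrip with the person string shown above should give back the name unchanged, which is a quick way to confirm that nothing was lost in serialization.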

 

Jars that dom4j depends on or that are related to it:

Transforming a Document with XSLT

  Applying XSLT on a Document is quite straightforward using the JAXP API from Oracle. This allows you to work against any XSLT engine such as Xalan or Saxon. Here is an example of using JAXP to create a transformer and then applying it to a Document.

 

import javax.xml.transform.Transformer;

import javax.xml.transform.TransformerFactory;

import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;

import org.dom4j.io.DocumentResult;

import org.dom4j.io.DocumentSource;



public class Foo {



   public Document styleDocument(Document document, String stylesheet) throws Exception {

        // load the transformer using JAXP

        TransformerFactory factory = TransformerFactory.newInstance();

        Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));



        // now lets style the given document

        DocumentSource source = new DocumentSource(document);

        DocumentResult result = new DocumentResult();

        transformer.transform(source, result);



        // return the transformed document

        Document transformedDoc = result.getDocument();

        return transformedDoc;

    }

}
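To try the JAXP part on its own, without dom4j, the same Transformer can be fed plain stream sources. The sketch below uses only JDK classes; the class name JaxpTransformSketch and the choice to pass the stylesheet as a string are assumptions made for illustration. With dom4j you would swap in DocumentSource and DocumentResult as in the listing above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper using only JDK classes.
class JaxpTransformSketch {

    // Applies the stylesheet text to the XML text and returns the result as a string.
    static String transform(String xml, String stylesheet) {
        try {
            // load the transformer using JAXP
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(stylesheet)));

            // run the transformation into an in-memory buffer
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("stylesheet or input could not be processed", e);
        }
    }
}
```

A stylesheet with text output that selects /person/name, applied to the person document used earlier, is a convenient first input.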

 

 

 

 

      XMLPULL

Key features:

  1. Simple interface

  2. Implementation independent

  3. Ease of use. The parser reports these events:

    •   START_DOCUMENT
    •   START_TAG
    •   TEXT
    •   END_TAG
    •   END_DOCUMENT

  4. Versatility

  5. Good performance

  6. Minimal requirements (device resources, memory, and so on)

http://dom4j.sourceforge.net/dependencies.html

Code step-by-step

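  First we need to create a parser instance. This takes three steps:

    1. Obtain an XMLPULL factory instance

    2. (Optional) By default the factory produces parsers that are not namespace aware; to change this, setNamespaceAware() must be called

    3. Create a parser instance

The following code implements these steps:

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();

factory.setNamespaceAware(true);

XmlPullParser xpp = factory.newPullParser();

The next step is to set the parser's input:

xpp.setInput(new FileReader(args[i]));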

 

  Now parsing can begin.

    To retrieve the next event, a typical XMLPULL application calls next() repeatedly until it reaches the END_DOCUMENT event, at which point processing stops.

public void processDocument(XmlPullParser xpp) throws XmlPullParserException, IOException {
  int eventType = xpp.getEventType();
  while (eventType != XmlPullParser.END_DOCUMENT) {
    if (eventType == XmlPullParser.START_DOCUMENT) {
      System.out.println("Start document");
    } else if (eventType == XmlPullParser.START_TAG) {
      processStartElement(xpp);
    } else if (eventType == XmlPullParser.END_TAG) {
      processEndElement(xpp);
    } else if (eventType == XmlPullParser.TEXT) {
      processText(xpp);
    }
    eventType = xpp.next();
  }
  // the loop exits on END_DOCUMENT, so report it here
  System.out.println("End document");
}
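XmlPullParser is not part of the JDK, so running the loop above needs an XMLPULL implementation on the classpath. For readers who only have the JDK at hand, the same read-the-next-event pattern can be tried with StAX (javax.xml.stream). This is a different API, swapped in purely as an analogy. The class name StaxPullSketch is made up, and the event labels are chosen here to mirror the XMLPULL names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper using the JDK's StAX API, not XMLPULL.
class StaxPullSketch {

    // Returns one label per event, using XMLPULL-style names.
    static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // report adjacent character data as a single event
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

            // a new reader is positioned on the start of the document
            out.add("START_DOCUMENT");
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("START_TAG " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("END_TAG " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (!reader.isWhiteSpace()) {
                            out.add("TEXT " + reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_DOCUMENT:
                        out.add("END_DOCUMENT");
                        break;
                    default:
                        break; // comments, processing instructions, etc. are skipped
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
        return out;
    }
}
```

Unlike the XMLPULL loop, this version handles END_DOCUMENT inside the loop, because hasNext() only turns false after that event has been delivered.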

 

  Let's see how to process a start tag. Processing an end tag is very similar; the main difference is that an end tag has no attributes.

 public void processStartElement (XmlPullParser xpp)

    {

        String name = xpp.getName();

        String uri = xpp.getNamespace();

        if ("".equals (uri)) {

            System.out.println("Start element: " + name);

        } else {

            System.out.println("Start element: {" + uri + "}" + name);

        }

    }

 

  Now let's see how to retrieve and print element content:

public void processText (XmlPullParser xpp) throws XmlPullParserException

    {

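        // getTextCharacters fills the holder with the start offset and the length

        int[] holderForStartAndLength = new int[2];

        char[] ch = xpp.getTextCharacters(holderForStartAndLength);

        int start = holderForStartAndLength[0];

        int length = holderForStartAndLength[1];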

        System.out.print("Characters:    \"");

        for (int i = start; i < start + length; i++) {

            switch (ch[i]) {

                case '\\':

                    System.out.print("\\\\");

                    break;

                case '"':

                    System.out.print("\\\"");

                    break;

                case '\n':

                    System.out.print("\\n");

                    break;

                case '\r':

                    System.out.print("\\r");

                    break;

                case '\t':

                    System.out.print("\\t");

                    break;

                default:

                    System.out.print(ch[i]);

                    break;

            }

        }

        System.out.print("\"\n");

    }

JUnit jar download:

http://ebr.springsource.com/repository/app/bundle/version/download?name=com.springsource.org.junit&version=4.8.1&type=binary

 

Second, prepare some code for the test cases:

package com.hoo.test;

 

import java.io.File;

import java.util.Iterator;

import java.util.List;

import org.dom4j.Attribute;

import org.dom4j.Document;

import org.dom4j.DocumentException;

import org.dom4j.DocumentHelper;

import org.dom4j.Element;

import org.dom4j.Node;

import org.dom4j.QName;

import org.dom4j.dom.DOMAttribute;

import org.dom4j.io.SAXReader;

import org.dom4j.tree.BaseElement;

import org.junit.After;

import org.junit.Before;

import org.junit.Test;

 

/**

 * <b>function:</b> Working with XML using Dom4j

 * @author hoojo

 * @createDate 2011-8-5 06:15:40 PM

 * @file DocumentTest.java

 * @package com.hoo.test

 * @project Dom4jTest

 * @blog http://blog.csdn.net/IBM_hoojo

 * @email hoojo_@126.com

 * @version 1.0

 */

public class DocumentTest {



    private SAXReader reader = null;



    @Before

    public void init() {

        reader = new SAXReader();

    }



    @After

    public void destroy() {

        reader = null;

        System.gc();

    }



    public void fail(Object o) {

        if (o != null)

            System.out.println(o);

    }

}

 

# Creating an XML document

The document looks like this:

<?xml version="1.0" encoding="UTF-8"?> 

<catalog> 

    <!--An XML Catalog--> 

    <?target instruction?>

    <journal title="XML Zone" publisher="IBM developerWorks"> 

         <article level="Intermediate" date="December-2001">

             <title>Java configuration with XML Schema</title> 

             <author> 

                 <firstname>Marcello</firstname> 

                 <lastname>Vitaletti</lastname> 

             </author>

           </article>

    </journal> 

</catalog>

 

The code that creates the document:

/**

 * <b>function:</b> Create a document

 * @author hoojo

 * @createDate 2011-8-5 06:18:18 PM

 */

@Test

public void createDocument() {

    // create a document

    Document doc = DocumentHelper.createDocument();



    // add an element

    Element root = doc.addElement("catalog");

    // add a comment to the root element

    root.addComment("An XML Catalog");

    // add a processing instruction

    root.addProcessingInstruction("target", "instruction");



    // create an element

    Element journalEl = new BaseElement("journal");

    // add attributes

    journalEl.addAttribute("title", "XML Zone");

    journalEl.addAttribute("publisher", "IBM developerWorks");

    root.add(journalEl);



    // add a child element

    Element articleEl = journalEl.addElement("article");

    articleEl.addAttribute("level", "Intermediate");

    articleEl.addAttribute("date", "December-2001");



    Element titleEl = articleEl.addElement("title");

    // set the text content

    titleEl.setText("Java configuration with XML Schema");

    //titleEl.addText("Java configuration with XML Schema");



    Element authorEl = articleEl.addElement("author");

    authorEl.addElement("firstname").setText("Marcello");

    authorEl.addElement("lastname").addText("Vitaletti");



    // addDocType() adds a document type declaration

    doc.addDocType("catalog", null,"file://c:/Dtds/catalog.dtd"); 

 

    fail(doc.getRootElement().getName());



    // convert the XML to text

    fail(doc.asXML());



    // write to a file

    /*XMLWriter output;

    try {

        output = new XMLWriter(new FileWriter(new File("file/catalog.xml")));

        output.write(doc);

        output.close();

    } catch (IOException e) {

        e.printStackTrace();

    }*/

}

* DocumentHelper is a document helper (utility) class. It can create documents, elements, text, attributes, comments, CDATA, Namespace and XPath objects, traverse a document with XPath, and convert text into a Document.

parseText converts an XML string into a Document:

Document doc = DocumentHelper.parseText("<root></root>");

createDocument creates a document:

Document doc = DocumentHelper.createDocument();

If an argument is passed, it creates a document that contains a root element.

createElement creates an element:

Element el = DocumentHelper.createElement("el");

* Document's addElement method adds a child element to the current document:

Element root = doc.addElement("catalog");

* addComment adds a comment:

root.addComment("An XML Catalog");

This adds a comment to the root element.

* addProcessingInstruction adds a processing instruction:

root.addProcessingInstruction("target", "instruction");

This adds a processing instruction to the root element.

* new BaseElement creates an element:

Element journalEl = new BaseElement("journal");

* addAttribute adds an attribute:

journalEl.addAttribute("title", "XML Zone");

* add adds an element:

root.add(journalEl);

This adds the journalEl element to the root element.

* addElement adds an element and returns the newly added element:

Element articleEl = journalEl.addElement("article");

This adds a child element article to journalEl.

* setText and addText set an element's text:

authorEl.addElement("firstname").setText("Marcello");
authorEl.addElement("lastname").addText("Vitaletti");

* addDocType sets the document's DOCTYPE:

doc.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");

* asXML converts a document or an element into an XML string:

doc.asXML();
root.asXML();

* The XMLWriter class writes a document to a file:

output = new XMLWriter(new FileWriter(new File("file/catalog.xml")));
output.write(doc);
output.close();
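The list above mentions that createDocument can take an argument, in which case the new document already contains a root element. A minimal sketch of that overload, assuming dom4j 2.x on the classpath (the class name CreateWithRootSketch and the summary method are illustrative only):

```java
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical helper, not part of dom4j.
class CreateWithRootSketch {

    // Builds a document around a pre-made root element and returns
    // "<root name>:<number of child elements>".
    static String summary() {
        // create the root element first, then hand it to createDocument
        Element root = DocumentHelper.createElement("catalog");
        Document doc = DocumentHelper.createDocument(root);

        // the element passed in is now the document's root, so children
        // added to it show up in the document
        root.addElement("journal").addAttribute("title", "XML Zone");

        Element docRoot = doc.getRootElement();
        return docRoot.getName() + ":" + docRoot.elements().size();
    }
}
```

This is equivalent in effect to calling createDocument() and then doc.addElement("catalog"); which form to use is mostly a matter of whether the root element already exists.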

 

# Modifying XML document content

/**

 * <b>function:</b> Modify XML content

 * @author hoojo

 * @createDate 2011-8-9 03:37:04 PM

 */

@SuppressWarnings("unchecked")

@Test

public void modifyDoc() {

    try {

        Document doc = reader.read(new File("file/catalog.xml"));



        // modify attribute content

        List list = doc.selectNodes("//article/@level");

        Iterator<Attribute> iter = list.iterator();

        while (iter.hasNext()) {

            Attribute attr = iter.next();

            fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());

            if ("Intermediate".equals(attr.getValue())) {

                // change the attribute value

                attr.setValue("Introductory");

                fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());

            }

        }



        list = doc.selectNodes("//article/@date");

        iter = list.iterator();

        while (iter.hasNext()) {

            Attribute attr = iter.next();

            fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());

            if ("December-2001".equals(attr.getValue())) {

                // change the attribute value

                attr.setValue("December-2011");

                fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());

            }

        }



        // modify node content

        list = doc.selectNodes("//article");

        Iterator<Element> it = list.iterator();

        while (it.hasNext()) {

            Element el = it.next();

            fail(el.getName() + "#" + el.getText() + "#" + el.getStringValue());

            // modify the title element

            Iterator<Element> elIter = el.elementIterator("title");

            while(elIter.hasNext()) {

                Element titleEl = elIter.next();

                fail(titleEl.getName() + "#" + titleEl.getText() + "#" + titleEl.getStringValue());

                if ("Java configuration with XML Schema".equals(titleEl.getTextTrim())) {

                    // change the element's text

                    titleEl.setText("Modify the Java configuration with XML Schema");

                    fail(titleEl.getName() + "#" + titleEl.getText() + "#" + titleEl.getStringValue());

                }

            }

        }



        // modify the content of a node's child elements

        list = doc.selectNodes("//article/author");

        it = list.iterator();

        while (it.hasNext()) {

            Element el = it.next();

            fail(el.getName() + "#" + el.getText() + "#" + el.getStringValue());

            List<Element> childs = el.elements();

            for (Element e : childs) {

                fail(e.getName() + "#" + e.getText() + "#" + e.getStringValue());

                if ("Marcello".equals(e.getTextTrim())) {

                    e.setText("Ayesha");

                } else if ("Vitaletti".equals(e.getTextTrim())) {

                    e.setText("Malik");

                } 

                fail(e.getName() + "#" + e.getText() + "#" + e.getStringValue());

            }

        }



        // write to a file

        /*XMLWriter output = new XMLWriter(new FileWriter(new File("file/catalog-modified.xml")));

        output.write(doc);

        output.close();*/

    } catch (DocumentException e) {

        e.printStackTrace();

    } catch (Exception e) {

        e.printStackTrace();

    }

}

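* reader.read(new File("file/catalog.xml")); reads the contents of the specified XML file into a Document.

* selectNodes is the XPath query method: pass it an XPath path to query the XML document. For usage, see the notes on using XPath with JDOM:

     http://www.cnblogs.com/hoojo/archive/2011/08/11/2134638.html

* getName gets the element's tag name; getValue and getText get the value and the text content.

* elementIterator("title"); gets all the title elements under the current node and returns an Iterator.

* elements gets all the child elements and returns them as a List.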

 

# Displaying document information

private String format(int i) {

    String temp = "";

    while (i > 0) {

        temp += "--";

        i--;

    }

    return temp;

}

 

/**

 * <b>function:</b> Recursively print document content

 * @author hoojo

 * @createDate 2011-8-9 03:43:45 PM

 * @param i

 * @param els

 */

private void print(int i, List<Element> els) {

    i++;

    for (Element el : els) {

        fail(format(i) + "##" + el.getName() + "#" + el.getTextTrim());

        if (el.hasContent()) {

            print(i, el.elements());

        } 

    }

}

 

/**

 * <b>function:</b> Display document information

 * @author hoojo

 * @createDate 2011-8-9 03:44:10 PM

 */

@Test

public void printInfo() {

    try {

        Document doc = reader.read(new File("file/catalog.xml"));

        fail("asXML: " + doc.asXML());



        fail(doc.asXPathResult(new BaseElement("article")));

        List<Node> list = doc.content();

        for (Node node : list) {

            fail("Node: " + node.getName() + "#" + node.getText() + "#" + node.getStringValue());

        }



        fail("-----------------------------");

        print(0, doc.getRootElement().elements());



        fail("getDocType: " + doc.getDocType());

        fail("getNodeTypeName: " + doc.getNodeTypeName());

        fail("getPath: " + doc.getRootElement().getPath());

        fail("getPath: " + doc.getRootElement().getPath(new BaseElement("journal")));

        fail("getUniquePath: " + doc.getRootElement().getUniquePath());

        fail("getXMLEncoding: " + doc.getXMLEncoding());

        fail("hasContent: " + doc.hasContent());

        fail("isReadOnly: " + doc.isReadOnly());

        fail("nodeCount: " + doc.nodeCount());

        fail("supportsParent: " + doc.supportsParent());

    } catch (DocumentException e) {

        e.printStackTrace();

    }

    fail("getEncoding: " + reader.getEncoding());

    fail("isIgnoreComments: " + reader.isIgnoreComments());

    fail("isMergeAdjacentText: " + reader.isMergeAdjacentText());

    fail("isStringInternEnabled: " + reader.isStringInternEnabled());

    fail("isStripWhitespaceText: " + reader.isStripWhitespaceText());

    fail("isValidating: " + reader.isValidating());

}

 

# Deleting document content

/**

 * <b>function:</b> Delete node content

 * @author hoojo

 * @createDate 2011-8-9 03:47:44 PM

 */

@Test

public void removeNode() {

    try {

        Document doc = reader.read(new File("file/catalog-modified.xml"));

        fail("comment: " + doc.selectSingleNode("//comment()"));

        // delete the comment

        doc.getRootElement().remove(doc.selectSingleNode("//comment()"));



        Element node = (Element) doc.selectSingleNode("//article");

        // delete an attribute

        node.remove(new DOMAttribute(QName.get("level"), "Introductory"));

        // delete an element node

        node.remove(doc.selectSingleNode("//title"));



        // remove() only deletes direct children, not deeper descendants, so call it on the node's parent

        Node lastNameNode = node.selectSingleNode("//lastname");

        lastNameNode.getParent().remove(lastNameNode);



        fail("Text: " + doc.selectObject("//*[text()='Ayesha']"));

        Element firstNameEl = (Element)doc.selectObject("//firstname");

        fail("Text: " + firstNameEl.selectSingleNode("text()"));



        // delete the text node

        //firstNameEl.remove(firstNameEl.selectSingleNode("text()"));

        //firstNameEl.remove(doc.selectSingleNode("//firstname/text()"));

        firstNameEl.remove(doc.selectSingleNode("//*[text()='Ayesha']/text()"));



        // delete the author child element

        //node.remove(node.selectSingleNode("//author"));



        fail(doc.asXML());

    } catch (Exception e) {

        e.printStackTrace();

    }

}

 

* Deleting a comment

doc.getRootElement().remove(doc.selectSingleNode("//comment()"));

This deletes the comment under the root element.

* Deleting an attribute

node.remove(new DOMAttribute(QName.get("level"), "Introductory"));

This deletes the attribute of node whose name is level and whose value is Introductory.

* Deleting an element

node.remove(doc.selectSingleNode("//title"));

This deletes the title element under node.

* Deleting text

firstNameEl.remove(firstNameEl.selectSingleNode("text()"));
firstNameEl.remove(doc.selectSingleNode("//firstname/text()"));
firstNameEl.remove(doc.selectSingleNode("//*[text()='Ayesha']/text()"));

Each of these deletes the text content of firstNameEl.
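The comment in removeNode notes that remove() only works on direct children, which is why lastname is removed through its parent. The following sketch illustrates that rule under the assumption that dom4j is on the classpath; RemoveChildSketch and its demo method are hypothetical names, and the sample document is cut down from the catalog used above.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical helper, not part of dom4j.
class RemoveChildSketch {

    // Returns {result of removing via the grandparent, result of removing via the parent}.
    static boolean[] demo() {
        try {
            Document doc = DocumentHelper.parseText(
                    "<article><author><firstname>Ayesha</firstname>"
                    + "<lastname>Malik</lastname></author></article>");
            Element article = doc.getRootElement();
            Element lastName = article.element("author").element("lastname");

            // lastname is a grandchild of article, so this call removes nothing
            boolean viaGrandparent = article.remove(lastName);

            // asking the direct parent removes the element
            boolean viaParent = lastName.getParent().remove(lastName);

            return new boolean[] {viaGrandparent, viaParent};
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should be well-formed", e);
        }
    }
}
```

Checking the boolean returned by remove() is a cheap way to notice that a node was not where the code expected it to be.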
