To pretty-print any JSON stream, a good starting point is the GSON library. It is a handy component that lets you:
- serialize a set of Java objects to a JSON stream, or
- convert a JSON stream into an equivalent set of Java objects
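As a quick illustration of those two directions outside PDI, here is a minimal sketch. It assumes the GSON jar is on the classpath; the Customer class and its fields are hypothetical and exist only for this example.

```java
import com.google.gson.Gson;

class GsonRoundTrip {
    // Hypothetical POJO, used only for this illustration.
    static class Customer {
        String name;
        int orders;

        Customer() { }

        Customer(String name, int orders) {
            this.name = name;
            this.orders = orders;
        }
    }

    // Java object -> JSON string
    static String toJson(Customer customer) {
        return new Gson().toJson(customer);
    }

    // JSON string -> Java object
    static Customer fromJson(String json) {
        return new Gson().fromJson(json, Customer.class);
    }

    public static void main(String[] args) {
        String json = toJson(new Customer("Ada", 3));
        System.out.println(json);

        Customer back = fromJson(json);
        System.out.println(back.name + " has " + back.orders + " orders");
    }
}
```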
To prepare PDI to run this example you must:
- Download the GSON library from the following link. I used version 2.2.3, but the same steps apply to other versions of the library.
- Unzip the file into a temporary directory
- Copy the gson-2.2.3.jar file to the <PDI_HOME>/libext directory
- Restart PDI so that it picks up the new library
The how-to
I started by building an example that produces an ugly JSON sample stream to format. To do this, I built a new transformation that reuses the input files of the sample multilayer XML file transformation to produce a simple JSON stream. The interesting part is at the very end of this transformation: once again, a User Defined Java Class step contains all the code that does the dirty work for you.
1: import com.google.gson.Gson;
2: import com.google.gson.GsonBuilder;
3: import com.google.gson.JsonParser;
4: import com.google.gson.JsonElement;
5: String jsonOutputField;
6: String jsonPPField;
7: public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
8: {
9:     // First, get a row from the default input hop
10:     //
11:     Object[] r = getRow();
12:     // If the row object is null, we are done processing.
13:     //
14:     if (r == null) {
15:         setOutputDone();
16:         return false;
17:     }
18:     // Let's look up parameters only once for performance reasons.
19:     //
20:     if (first) {
21:         jsonOutputField = getParameter("JSONOUTPUT_FIELD");
22:         jsonPPField = getParameter("JSONPP_FIELD");
23:         first = false;
24:     }
25:     // It is always safest to call createOutputRow() to ensure that your output row's Object[] is large
26:     // enough to handle any new fields you are creating in this step.
27:     //
28:     Object[] outputRow = createOutputRow(r, data.outputRowMeta.size());
29:     logBasic("Input row size: " + r.length);
30:     logBasic("Output row size: " + data.outputRowMeta.size());
31:     String jsonOutput = get(Fields.In, jsonOutputField).getString(r);
32:     Gson gson = new GsonBuilder().setPrettyPrinting().create();
33:     JsonParser jp = new JsonParser();
34:     JsonElement je = jp.parse(jsonOutput);
35:     String jsonpp = gson.toJson(je);
36:     // Set the value in the output field
37:     //
38:     get(Fields.Out, jsonPPField).setValue(outputRow, jsonpp);
39:     // putRow will send the row on to the default output hop.
40:     //
41:     putRow(data.outputRowMeta, outputRow);
42:     return true;
43: }
This time the interesting code is between lines 32 and 35:
- The Gson object is created with pretty printing enabled (line 32)
- The incoming JSON stream is read and parsed into a JsonElement tree (lines 33-34)
- A new pretty-printed JSON string is built from that tree and stored in the jsonpp variable (line 35), which line 38 then writes to the output field
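To try those same four lines outside PDI, the following standalone sketch may help. It assumes the GSON jar is on the classpath; the class name, method name and sample input are made up for this illustration. It keeps the new JsonParser().parse(...) call used with version 2.2.3; as far as I know, GSON 2.8.6 and later deprecate it in favour of the static JsonParser.parseString(...).

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

class PrettyPrintDemo {
    // One shared pretty-printing instance; Gson instances are thread-safe.
    private static final Gson PRETTY = new GsonBuilder().setPrettyPrinting().create();

    // Parses a compact JSON string into a tree and re-serializes it with indentation.
    static String prettyPrint(String compactJson) {
        JsonElement tree = new JsonParser().parse(compactJson);
        return PRETTY.toJson(tree);
    }

    public static void main(String[] args) {
        // Hypothetical input, standing in for the ugly stream built by the transformation.
        String ugly = "{\"customer\":{\"name\":\"Ada\",\"orders\":[1,2,3]}}";
        System.out.println(prettyPrint(ugly));
    }
}
```

In the User Defined Java Class step, the same idea would suggest creating the Gson object once (for example inside the first-row block) rather than on every row, although for small streams the difference is probably negligible.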
Finally, the pretty-printed JSON stream is saved to a .js file, and that's all.
You can download the sample transformation from this link. I hope you enjoyed this two-part article and that you find it useful.