In this blog we explore the spath command in Splunk. The spath command extracts fields from structured data formats such as XML and JSON. It is also available as the spath() function inside eval.
Splunk can extract JSON keys and values automatically at search time when KV_MODE=json is set in props.conf. Some fields in complex data are still not extracted that way, and for those we use the spath command.
We will look at three kinds of data: structured JSON, JSON embedded in an unstructured log line, and XML.
Structured JSON data
"level": "info",
"programs": [{
    "season": "1842-43",
    "orchestra": "New York Philharmonic",
    "concerts": [{
        "Date": "1842-12-07T05:00:00Z",
        "eventType": "Subscription Season",
        "Venue": "Apollo Rooms",
        "Location": "Manhattan, NY",
        "Time": "8:00PM"
    }],
    "programID": "3853",
    "works": [{
        "workTitle": "SYMPHONY NO. 5 IN C MINOR, OP.67",
        "conductorName": "Hill, Ureli Corelli",
        "ID": "52446",
        "soloists": [],
        "composerName": "Beethoven, Ludwig van"
    }, {
        "workTitle": "OBERON",
        "composerName": "Weber, Carl Maria Von",
        "conductorName": "Timm, Henry C.",
        "ID": "88344",
        "soloists": [{
            "soloistName": "Otto, Antoinette",
            "soloistRoles": "S",
            "soloistInstrument": "Soprano"
        }],
To extract the value of programs from this JSON, pass its path to the spath command. Because programs is an array, the path ends in {}:
index=test source="raw.json" | spath path=programs{}
Each element of the programs array has a season field. To reach it, use {} to step into the array and a dot (.) to select the field:
index=test source="raw.json" | spath path=programs{}.season
Each element of programs also contains a works array. To extract a field nested inside that array, chain the same notation:
index=test source="raw.json" | spath path=programs{}.works{}.workTitle
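To make the {} and dot notation more concrete, here is a small Python sketch of what such a path selects. It is only an illustration and not how Splunk implements spath. The select helper is hypothetical, and the event is a trimmed copy of the sample above that keeps only the fields used here.

```python
import json

# Trimmed copy of the sample event; only the fields used below are kept.
event = json.loads("""
{
  "level": "info",
  "programs": [{
    "season": "1842-43",
    "orchestra": "New York Philharmonic",
    "programID": "3853",
    "works": [
      {"workTitle": "SYMPHONY NO. 5 IN C MINOR, OP.67", "ID": "52446"},
      {"workTitle": "OBERON", "ID": "88344"}
    ]
  }]
}
""")


def select(node, path):
    """Hypothetical helper: return every value matched by an spath-style path."""
    nodes = [node]
    for step in path.split("."):
        # A trailing {} means "this key holds an array, visit each element".
        is_array = step.endswith("{}")
        key = step[:-2] if is_array else step
        next_nodes = []
        for current in nodes:
            value = current.get(key) if isinstance(current, dict) else None
            if value is None:
                continue
            if is_array and isinstance(value, list):
                next_nodes.extend(value)  # fan out over the array elements
            else:
                next_nodes.append(value)
        nodes = next_nodes
    return nodes


seasons = select(event, "programs{}.season")
titles = select(event, "programs{}.works{}.workTitle")
print(seasons)
print(titles)
```

Each {} fans out over one array level, which is why the second path returns one title per element of works.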
Unstructured JSON data
2013-12-23T14:55:09.574+0000|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=102;_ThreadName=Thread-2;|2013-12-23 14:55:09,574 DEBUG parent-container$child#1-10 [] com.abc.transform.listeners.xyz- [{ "timestamp" : "2013-12-23T14:55:09.558Z", "host" : "myPC", "event_id" : "1234", "customer_id" : "123456", "country" : "Canada", "product" : "iPad", "msg" : "Hello Guys", "transaction_id" : "100200300400" }]
The event above is an unstructured log line with a JSON array embedded at the end. Extracting fields from the embedded JSON is harder, but it is possible by combining the rex and spath commands.
To extract the country field, first capture the JSON portion into a field with rex, then pass that field to spath:
index=test source="mixed.json" | rex field=_raw "com\.abc\.transform\.listeners\.xyz-\s*(?<json_field>.+)" | spath input=json_field path={}.country output=country
The JSON starts with an array, so the path begins with {}. The output argument names the field that receives the extracted value.
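The same two-step idea (cut the JSON out with a regular expression, then parse it) can be sketched in Python. This is only an illustration of the logic, not Splunk code. The group name json_field mirrors the field passed to spath input, and Python writes named groups as (?P<name>...) where rex uses (?<name>...).

```python
import json
import re

# The sample event from above, split across lines for readability.
raw = (
    '2013-12-23T14:55:09.574+0000|INFO|glassfish3.1.2|'
    'javax.enterprise.system.std.com.sun.enterprise.server.logging|'
    '_ThreadID=102;_ThreadName=Thread-2;|2013-12-23 14:55:09,574 DEBUG '
    'parent-container$child#1-10 [] com.abc.transform.listeners.xyz- '
    '[{ "timestamp" : "2013-12-23T14:55:09.558Z", "host" : "myPC", '
    '"event_id" : "1234", "customer_id" : "123456", "country" : "Canada", '
    '"product" : "iPad", "msg" : "Hello Guys", '
    '"transaction_id" : "100200300400" }]'
)

# Step 1 (the rex step): capture everything after the logger name.
pattern = r"com\.abc\.transform\.listeners\.xyz-\s*(?P<json_field>.+)"
match = re.search(pattern, raw)
json_field = match.group("json_field")

# Step 2 (the spath step): parse the captured text and walk {}.country,
# that is, read "country" from every element of the top-level array.
countries = [item["country"] for item in json.loads(json_field)]
print(countries)
```

If the regular expression captures anything other than valid JSON, the parse step fails, so it is worth checking the captured field on its own before adding the spath step.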
XML data
<?xml version="1.0"?>
<purchases>
    <book>
        <author>Martin, George R.R.</author>
        <title yearPublished="1996">A Game of Thrones</title>
        <title yearPublished="1998">A Clash of Kings</title>
    </book>
    <book>
        <author>Clarke, Susanna</author>
        <title yearPublished="2004">Jonathan Strange and Mr. Norrell</title>
    </book>
    <book>
        <author>Kay, Guy Gavriel</author>
        <title yearPublished="1990">Tigana</title>
    </book>
    <book>
        <author>Bujold, Lois McMasters</author>
        <title yearPublished="1986">The Warrior's Apprentice</title>
    </book>
</purchases>
The spath command is also very useful for extracting fields from XML. The data above describes book purchases. In an XML path, use a dot (.) to step through nested elements and {@attribute} to select an attribute value. To extract the yearPublished attribute of each title element, use the following syntax:
index=test source="test.xml" | spath output=yearpublished path=purchases.book.title{@yearPublished}
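For comparison, here is a rough Python sketch of what that XML path selects, using the standard library parser. It is an illustration only and not how spath works internally. It assumes the attribute values are quoted, because a strict XML parser rejects unquoted ones, and it uses a shortened copy of the sample.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample, with attribute values quoted (assumption)
# so that a strict XML parser accepts it.
xml_event = """<purchases>
  <book>
    <author>Martin, George R.R.</author>
    <title yearPublished="1996">A Game of Thrones</title>
    <title yearPublished="1998">A Clash of Kings</title>
  </book>
  <book>
    <author>Clarke, Susanna</author>
    <title yearPublished="2004">Jonathan Strange and Mr. Norrell</title>
  </book>
</purchases>"""

root = ET.fromstring(xml_event)  # root is the <purchases> element

# purchases.book.title{@yearPublished}: visit every <title> under every
# <book> and read its yearPublished attribute.
years = [title.get("yearPublished") for title in root.findall("./book/title")]
print(years)
```

As with the JSON arrays, repeated elements produce one value each, so the result is a list of years.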
These are some of the ways to use the spath command to extract fields from JSON and XML data.
If you still have questions about the spath command in Splunk, feel free to ask in the comment section below. Happy Splunking 😉