Learning Python: Data Science (Day 2)

10. April 2019 | 2 minutes read

Day 2 of my Python excursion was all about parsing the raw data from the log into an ordered list. I used hashes of arrays extensively for generating transient IR or UV spectra with Perl, so naturally I tried to achieve something similar in this project. The current structure of the parsed log, however, only lets me filter it by the event name. For anything more specific I have to loop over all entries of that event:

	dictionary["key"][0...n] -> [data1, data2, data3, ...]
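To make that access pattern concrete, here is a minimal sketch with made-up entries. The spell names, target IDs, timestamps and damage values are placeholders and not taken from a real log, and `entries_for_target` is a hypothetical helper. The event name selects a list directly; everything else needs a loop:

```python
# Hypothetical sample in the same layout as the parsed log:
# event name -> list of per-line dictionaries.
log_data = {
    "SPELL_DAMAGE": [
        {"timeStamp": "4/10 20:00:01.000", "targetID": "Creature-1",
         "spellName": "Fireball", "damageValue": "120"},
        {"timeStamp": "4/10 20:00:03.000", "targetID": "Creature-2",
         "spellName": "Fireball", "damageValue": "80"},
    ],
    "SPELL_PERIODIC_DAMAGE": [
        {"timeStamp": "4/10 20:00:04.000", "targetID": "Creature-1",
         "spellName": "Ignite", "damageValue": "15"},
    ],
}


def entries_for_target(log_data, event_name, target_id):
    """Return all entries of one event type that hit the given target."""
    # The first key narrows the data to one event; the rest is a linear scan.
    return [entry for entry in log_data[event_name]
            if entry["targetID"] == target_id]


hits = entries_for_target(log_data, "SPELL_DAMAGE", "Creature-1")
print(hits)
```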

In the next session I will probably switch to a “nested” dictionary, provided I find a reasonable order for the keys:

	dictionary["key1"]["key2"]["key3"]... = data
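One way such a nested layout could be built is with `dict.setdefault`, which creates each level on demand. The key order used here (event, then target, then spell) is only an assumption for illustration, and the entries are made up:

```python
# Hypothetical flat entries, as they might come out of the parser:
# (event name, target ID, spell name, damage).
entries = [
    ("SPELL_DAMAGE", "Creature-1", "Fireball", 120),
    ("SPELL_DAMAGE", "Creature-1", "Fireball", 95),
    ("SPELL_PERIODIC_DAMAGE", "Creature-1", "Ignite", 15),
]

nested = {}
for event_name, target_id, spell_name, damage in entries:
    # setdefault returns the existing value for the key, or inserts
    # the given default first and returns that.
    by_target = nested.setdefault(event_name, {})
    by_spell = by_target.setdefault(target_id, {})
    by_spell.setdefault(spell_name, []).append(damage)

# Every level is now reachable by key, without a loop.
print(nested["SPELL_DAMAGE"]["Creature-1"]["Fireball"])
```

The trade-off is that the key order is fixed once the structure is built: a question such as "all targets hit by one spell" would still need a loop with this particular order.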

I also found it useful to collect the unique identifiers directly while reading the file instead of looping through the dictionary later.
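As a rough sketch of that idea: the script further down checks membership with `not in` on a list, while this variant swaps in a `set` for the lookup and keeps a list only to preserve the first-seen order. The in-memory "file" and its three-column lines are made up and far shorter than real log entries:

```python
import io

# Stand-in for the log file; the lines are invented for this example.
fake_log = io.StringIO(
    "1,Creature-1,Fireball\n"
    "2,Creature-2,Fireball\n"
    "3,Creature-1,Ignite\n"
)

seen_targets = set()   # fast membership test
unique_targets = []    # keeps the order of first appearance

for line in fake_log:
    fields = line.rstrip("\n").split(",")
    target_id = fields[1]
    if target_id not in seen_targets:
        seen_targets.add(target_id)
        unique_targets.append(target_id)

print(unique_targets)
```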

(I also still do not like the fact that Python leaves out the curly brackets for loops/blocks.)


TIL:

  • Perl hashes = dictionaries in Python
  • dictionary of lists: every key has to be initialized with an empty list before appending
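The initialization point is easy to trip over when coming from Perl, where pushing onto a hash entry that does not exist yet creates the array automatically. A small sketch of the behaviour in Python, together with `collections.defaultdict` as one possible way around the explicit setup (the appended spell name is a placeholder):

```python
from collections import defaultdict

log_events = ["SPELL_DAMAGE", "SPELL_PERIODIC_DAMAGE"]

# A plain dictionary raises KeyError if the list was never created.
plain = {}
try:
    plain["SPELL_DAMAGE"].append("Fireball")
    raised = False
except KeyError:
    raised = True

# Explicit initialization: one empty list per known key.
explicit = {event: [] for event in log_events}
explicit["SPELL_DAMAGE"].append("Fireball")

# defaultdict(list) creates the empty list on first access to a missing key.
auto = defaultdict(list)
auto["SPELL_DAMAGE"].append("Fireball")

print(raised, explicit, dict(auto))
```

Note the difference between the two working variants: the explicit version contains every known event even if nothing was appended, while the `defaultdict` only contains keys that were actually touched.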

(I have omitted the comments from my original file here.)

#!/usr/bin/python
import re
import sys

inputFile = "combat.log"
playerName = "foo"
playerServer = "bar"
logEvents = ["SPELL_DAMAGE", "SPELL_PERIODIC_DAMAGE"]
targetList = []
targetDictionary = {}
spellList = {}
logData = {}
for event in logEvents:
	spellList[event] = []
	logData[event] = []

playerID = playerName + "-" + playerServer

try:
	inFH = open(inputFile, "rt")
except IOError:
	print("\n\n\tERROR (input): Could not open/find", inputFile, "\n")
	sys.exit(1)

for line in inFH:
	if re.match(r"^\d+", line) and (re.search(logEvents[0], line) or re.search(logEvents[1], line)) \
	and re.search(playerID, line):

		splitLine = line.split(",")

		timeStamp, eventName = splitLine[0].split("  ")
		lineData = {"timeStamp": timeStamp, "targetID": splitLine[5],
		 "spellName": splitLine[10], "damageValue": splitLine[29]}

		logData[eventName].append(lineData)

		if splitLine[10] not in spellList[eventName]:
			spellList[eventName].append(splitLine[10])
		if splitLine[5] not in targetList:
			targetList.append(splitLine[5])
		if splitLine[5] not in targetDictionary:
			targetDictionary[splitLine[5]] = splitLine[6]

inFH.close()

print("Used spells:\n", spellList)

Next steps:

  • maybe change the layout of the logData dictionary
  • start generating some information from the data
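As a first idea for the second point, a total per spell could be computed from the current layout roughly like this. It is only a sketch: the sample entries are made up, `damage_per_spell` is a hypothetical helper, and it assumes that the damage field always holds a plain integer string:

```python
from collections import defaultdict

# Hypothetical sample in the current layout: event name -> list of entries.
log_data = {
    "SPELL_DAMAGE": [
        {"spellName": "Fireball", "damageValue": "120"},
        {"spellName": "Fireball", "damageValue": "80"},
    ],
    "SPELL_PERIODIC_DAMAGE": [
        {"spellName": "Ignite", "damageValue": "15"},
    ],
}


def damage_per_spell(log_data):
    """Sum the damage of every spell across all event types."""
    totals = defaultdict(int)
    for entries in log_data.values():
        for entry in entries:
            # Values come out of the log as strings, so convert first.
            totals[entry["spellName"]] += int(entry["damageValue"])
    return dict(totals)


totals = damage_per_spell(log_data)
print(totals)
```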