Learning Python: Data Science (Day 3)

12. April 2019 | 3 minutes read

Because I reordered the data container, the actual “data science” part is pretty straightforward. If I had kept the dictionary of lists from the first version, I would probably have needed many split() calls to get to the data I want to process. But now I just need three loops over all keys and one split at the end.
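To illustrate the nested layout, here is a minimal sketch of a container keyed by event, then target, then spell. The sample records and the helper name add_record are made up for illustration, and setdefault is just one way to build the nesting (the script itself uses explicit "not in" checks).

```python
# Hypothetical sample records: (event, target ID, spell, timestamp, damage)
records = [
    ("SPELL_DAMAGE", "Creature-1", "Fireball", "10:00:01.000", "9000"),
    ("SPELL_DAMAGE", "Creature-1", "Fireball", "10:00:03.000", "10000"),
    ("SPELL_DAMAGE", "Creature-2", "Scorch", "10:00:04.000", "2500"),
]


def add_record(log_data, event, target, spell, time_stamp, damage):
    """Append one hit to log_data[event][target][spell], creating levels as needed."""
    spells = log_data.setdefault(event, {}).setdefault(target, {})
    spells.setdefault(spell, []).append([time_stamp, damage])


log_data = {}
for record in records:
    add_record(log_data, *record)

# Three nested loops over the keys reach every list of hits directly.
for event, targets in log_data.items():
    for target, spells in targets.items():
        for spell, hits in spells.items():
            print(event, target, spell, len(hits))
```

With this layout, every value is already separated by event, target and spell, so no string has to be taken apart again during the evaluation.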

The most time-consuming part of today’s session was getting a pleasing output of the data. Everything else worked just as I have done it in other programming/scripting languages. I also found out that the initialization of the data container was flawed because of the elifs.
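The earlier version of the script is not shown here, so the following is only a guess at the kind of problem an elif chain can cause when initializing a nested dictionary. With elif, at most one branch runs per call, so a new event gets its outer level but not the inner ones. Independent if checks create every missing level. The function names and sample keys are hypothetical.

```python
def init_with_elif(log_data, event, target, spell):
    # Flawed: only the first matching branch runs.
    if event not in log_data:
        log_data[event] = {}
    elif target not in log_data[event]:
        log_data[event][target] = {}
    elif spell not in log_data[event][target]:
        log_data[event][target][spell] = []


def init_with_if(log_data, event, target, spell):
    # Each level is checked on its own, so all missing levels are created.
    if event not in log_data:
        log_data[event] = {}
    if target not in log_data[event]:
        log_data[event][target] = {}
    if spell not in log_data[event][target]:
        log_data[event][target][spell] = []


flawed, fixed = {}, {}
init_with_elif(flawed, "SPELL_DAMAGE", "Creature-1", "Fireball")
init_with_if(fixed, "SPELL_DAMAGE", "Creature-1", "Fireball")

print(flawed)  # {'SPELL_DAMAGE': {}}
print(fixed)   # {'SPELL_DAMAGE': {'Creature-1': {'Fireball': []}}}
```

In the elif variant, the first append to the spell list would fail with a KeyError, because the target and spell levels do not exist yet.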


TIL:

  • print() formatting with end= and sep=
  • string formatting with ljust and center
  • functions (I am using these as subroutines - not sure if this is the correct use case)
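
As a small sketch of the first two points, the snippet below prints a progress message on one line and then builds a tiny aligned table. The spell names, values and the header text are made up for illustration.

```python
# Hypothetical spell names and average values, only for illustration.
rows = [("Fireball", 9431.0), ("Scorch", 2561.67)]
header = "Target"

name_width = max(len(name) for name, _ in rows)
value_width = max(len(header), max(len(str(value)) for _, value in rows))

# sep="" suppresses the spaces between arguments, end="" the trailing newline,
# so the later "done" lands on the same line.
print("Parsing file ", "combat.log", "...", sep="", end="")
print("done")

# ljust pads on the right (left-aligned text), center pads on both sides.
lines = [" " * name_width + " | " + header.center(value_width) + " |"]
for name, value in rows:
    lines.append(name.ljust(name_width) + " | " + str(value).center(value_width) + " |")

print("\n".join(lines))
```

Because every cell is padded to the same width, all lines end up equally long and the column separators line up.
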
#!/usr/bin/python
import re

def main():
	inputFile = "combat.log"
	outputFile = "output.tab"
	playerName = "foo"
	playerServer = "bar"
	logEvents = ["SPELL_DAMAGE", "SPELL_PERIODIC_DAMAGE"]
	targetList = []
	targetDictionary = {}
	spellList = {}
	for l in logEvents:
		spellList[l] = []
	logData =  {}
	
	playerID = playerName+"-"+playerServer
	
	try:
		inFH = open(inputFile, "rt")
	except IOError:
		print ("\n\n\tERROR (input): Could not open/find",inputFile,"\n")
		# stop here: without an open file handle the loop below would fail
		return
	
	print("\n\nParsing file ",inputFile,"...",sep="",end="")
	
	#loop through file and do the parsing
	for line in inFH:
		if re.match(r"^\d+",line) and (re.search(logEvents[0],line) or re.search(logEvents[1],line)) \
			 and re.search(playerID,line):
	
			splitLine = line.split(",")
			
			# dirty hack: split at double whitespace
			timeStamp, eventName = splitLine[0].split("  ")
			
			# not really necessary, but just for easy reading
			targetID = splitLine[5]
			targetName = splitLine[6]
			spellName = splitLine[10]
			damageValue = splitLine[29]

			# add to dictionary
			if eventName not in logData.keys():
				logData[eventName] = {}
			
			if targetID not in logData[eventName].keys():
				logData[eventName][targetID] = {}
			
			if spellName not in logData[eventName][targetID].keys():
				logData[eventName][targetID][spellName] = []
			
			logData[eventName][targetID][spellName].append([timeStamp,damageValue])
	
			# get unique spells and targets
			if spellName not in spellList[eventName]:
				spellList[eventName].append(spellName)
			if targetID not in targetList:
				targetList.append(targetID)	
			# create target ID<->name dictionary
			if targetID not in targetDictionary.keys():
				targetDictionary[targetID] = targetName
	
	
	inFH.close()
	
	print("done\n")

	# 1. get average dmg for all spells
	for e in logEvents:
		print ("Evaluation for event "+e+":\n")
		
		targetLine = ""
		targetNameLength = 0

		for t in targetList:
			targetLine = targetLine+" "+targetDictionary[t]+" |"
			if len(targetDictionary[t]) > targetNameLength:
				targetNameLength = len(targetDictionary[t])
			
		# use the current event e, not the eventName left over from the parsing loop
		spellNameLength = getSpellnameLength(spellList[e])
		space = " " * spellNameLength		

		print(space+" |"+targetLine)
		for s in spellList[e]:
			outLine = s.ljust(spellNameLength)+" |"
			for t in targetList:
				# a target may never have been hit by this event type
				if t in logData[e] and s in logData[e][t]:
					outLine = outLine+" "+str(getAverageDmg(logData[e][t][s])).center(targetNameLength)+" |"
				else:
					outLine = outLine+" "+str(0).center(targetNameLength)+" |"
			print(outLine)

		print()


def getSpellnameLength(data):

	# length of the longest spell name (0 for an empty list)
	return max((len(d) for d in data), default=0)


def getAverageDmg(data):

	totalDamage = 0

	# each entry is a [timeStamp, damageValue] pair
	for timeStamp, damageValue in data:
		totalDamage += int(damageValue)

	return round(totalDamage/len(data),2)

if __name__ == "__main__":
	main()

Output to console:

Parsing file combat.log...done

Evaluation for event SPELL_DAMAGE:

                  | "Training Dummy" | "Training Dummy" | "Training Dummy" |
"Fire Blast"      |      6823.0      |        0         |     10097.33     |
"Pyroblast"       |     12529.0      |        0         |     20596.0      |
"Living Bomb"     |      2009.0      |      2009.0      |      2009.0      |
"Meteor"          |     14728.0      |     14728.0      |     14728.0      |
"Fireball"        |      9431.0      |     11316.5      |      8210.0      |
"Heed My Call"    |        0         |        0         |      4207.0      |
"Scorch"          |        0         |        0         |     2561.67      |
"Gutripper"       |        0         |        0         |      2971.0      |
"Dragon's Breath" |        0         |      8366.0      |        0         |
"Flamestrike"     |      7079.0      |      7079.0      |      7079.0      |

Evaluation for event SPELL_PERIODIC_DAMAGE:

                  | "Training Dummy" | "Training Dummy" | "Training Dummy" |
"Living Bomb"     |      580.0       |      1011.5      |      1011.5      |
"Ignite"          |     3115.03      |     2827.86      |     3180.38      |
"Trailing Embers" |      405.87      |      414.55      |      414.55      |
"Meteor Burn"     |     1294.25      |     1294.25      |     1294.25      |

TODO:

  • obviously more data evaluation
  • using a larger log file