04-06-15, 11:33 AM   #1
Dimmulux
A Deviate Faerie Dragon
Join Date: Feb 2014
Posts: 17
How can I obtain an offline csv of all battle pets, their statistics and abilities?

Note that this question does not relate to an addon. As part of a university project (concerning game playing algorithms), I find that I need an offline information store containing a listing of the battle pets, their statistics and their abilities (including all effects of their abilities in some sensible format). The focus of the project is running algorithms (in Java) on the information, not the collection of the information itself. A listing of all pets (rather than just manually copying across the information about a few pets) is desirable so that I can test my algorithms on a large number of different combinations. I have read through the thread http://www.wowinterface.com/forums/s...ad.php?t=49083 and tried the relevant suggestions as discussed below.

Question: How can I get an offline database/csv containing a listing of all battle pets, their statistics and their abilities?

What I've tried so far:


1. Web scraping from wowhead and warcraftpets. In both cases, I was only attempting to scrape a single page as a test case (http://www.wowhead.com/petspecies and http://www.warcraftpets.com/wow-pets/filter/ respectively). In neither case did I retrieve the information I wanted (a partial listing of pets). I tried jsoup first, but this was unsuccessful because both pages load their content using AJAX and jsoup does not execute JavaScript. Online, I found that HtmlUnit was recommended for working with pages that use AJAX, but the following exceptions occurred:
on wowhead:
com.gargoylesoftware.htmlunit.ScriptException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: Exception invoking close
on warcraftpets:
com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method "replace" of undefined (http://j.adlooxtracking.com/ads/js/t...one62v_1.js#14)

I can copy/paste the entire stack traces if they would be helpful, but they are hundreds of lines long.


2. Appending &xml to wowhead URLs I plan to scrape. This does not seem to work outside of item pages. Importantly, it doesn't work with NPC or pet ability pages.


3. Using the official pet API: https://github.com/Blizzard/api-wow-docs#battlepet-api. This seems to miss some necessary information:
  1. What are the valid values of species?
  2. Given a species, what are the valid values of breed?
  3. What effects does an ability have? (something of a form similar to: damage(20 + power)).

I have not yet done so but, if I find a method that works, I intend to retrieve information in that way for each pet and each ability. This would mean making roughly 1500-10000 requests (a very rough estimate, but should give an idea of order of magnitude) sequentially over a residential network connection. Introducing an artificial delay between the requests would be possible. Would this go against the terms of use of any of the sites in question? I cannot find any mention of this in either site's TOS: http://www.wowhead.com/tos or www.warcraftpets.com/help/. Would it be advisable to contact the sites' owners to check that it's okay?
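
For illustration, the sequential requests with an artificial delay could look roughly like the sketch below. It is written in JavaScript to match the other snippets in this thread. The names sleep, fetchSequentially and fetchPage are hypothetical, and fetchPage is whatever function actually downloads a page (it is assumed to take a URL and return a Promise).

```javascript
// Hypothetical helper: resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Fetch each URL in order, pausing delayMs between consecutive requests.
// fetchPage is supplied by the caller (an assumption of this sketch):
// it takes a URL and returns a Promise for the page body.
async function fetchSequentially(urls, fetchPage, delayMs) {
  var results = [];
  for (var i = 0; i < urls.length; i++) {
    if (i > 0) await sleep(delayMs); // no pause before the first request
    results.push(await fetchPage(urls[i]));
  }
  return results;
}
```

With a delay of one second, 10000 requests would take a little under three hours plus download time, so the upper estimate is still practical.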

Any suggestions on how to proceed from here would be very helpful. I also welcome any comments on what I have done (wrong) so far.
04-06-15, 12:36 PM   #2
Petrah
A Pyroguard Emberseer
 
Join Date: Jan 2008
Posts: 2,988
I recently found myself wanting my own offline list similar to the one from http://petsear.ch/Pets. It would be so much easier since battle.net is slow to gather info about your newly obtained/newly leveled pets.

I would love to have the same info as petsear.ch in a spreadsheet that you can snag from in game. There used to be another addon for gathering guild information in the same way (from in game, copied so you can paste to a spreadsheet) but I cannot recall the name of the addon.
__________________
♪~ ( ) I My Sonos!
AddOn Authors: If your addon spams the chat box with "Addon v8.3.4.5.3 now loaded!", please add an option to disable it!
04-06-15, 01:24 PM   #3
Dimmulux
A Deviate Faerie Dragon
Join Date: Feb 2014
Posts: 17
Yes, something similar to the spreadsheet downloadable from http://petsear.ch/Breeds would be great. This would solve my problem if it were to also give the abilities of the pets and relevant numbers.

Thank you for the link and comment. Looking at this website was helpful.
04-06-15, 04:30 PM   #4
Barjack
A Black Drake
Join Date: Apr 2009
Posts: 89
I tried contacting Wowhead a couple of times about scraping their data without much luck (never got a response) and in the end I just decided to go ahead and do it since I wasn't going to use it for any nefarious purpose.

In my case I need item information, so it's a far bigger scrape than what you're doing. I need both the XML and HTML versions of each page, and there are a lot of items in WoW (almost 100,000). So I end up making about 200,000 requests. I wrote a custom Ruby script to download them all that can use any number of simultaneous threads. On my cable connection and if I set the thread count to 20, the XML half only takes about an hour but the HTML half takes about 3 hours.

Then I parse them all using another Ruby script using Nokogiri (XML/HTML parser) and simple regular expressions. The site uses AJAX often so there are often JSON objects sitting around on various pages that you can load into memory with a JSON parser, too. These are sometimes more convenient than scraping the HTML.
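
As a rough illustration of the "JSON objects sitting around" idea, the sketch below pulls the array that follows a marker string out of a page's HTML and parses it. It is in JavaScript rather than Ruby, to match the other snippets in this thread. The function name extractJsonArray and the marker are hypothetical, and it assumes the embedded data is strict JSON; if the page uses a looser JavaScript literal (unquoted keys, for example), JSON.parse will throw.

```javascript
// Return the parsed JSON array that follows `marker` in `html`,
// or null if the marker or a balanced array cannot be found.
function extractJsonArray(html, marker) {
  var start = html.indexOf(marker);
  if (start === -1) return null;
  start = html.indexOf('[', start + marker.length);
  if (start === -1) return null;

  var depth = 0, inString = false, escaped = false;
  for (var i = start; i < html.length; i++) {
    var c = html[i];
    if (inString) {
      // Inside a string literal, brackets do not count.
      if (escaped) escaped = false;
      else if (c === '\\') escaped = true;
      else if (c === '"') inString = false;
    } else if (c === '"') {
      inString = true;
    } else if (c === '[') {
      depth++;
    } else if (c === ']' && --depth === 0) {
      // Found the bracket that closes the outermost array.
      return JSON.parse(html.slice(start, i + 1));
    }
  }
  return null; // unbalanced brackets
}
```

Tracking string state matters because names can themselves contain brackets, which a plain regular expression would trip over.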
04-06-15, 04:57 PM   #5
JDoubleU00
A Firelord
 
Join Date: Mar 2008
Posts: 463
Sorry to post off topic, but wow, petsear.ch is pretty awesome.
__________________
Author of JWExpBar and JWRepBar.
04-06-15, 05:52 PM   #6
MoonWitch
A Firelord
Join Date: Sep 2007
Posts: 455
Originally Posted by Barjack
Then I parse them all using another Ruby script using Nokogiri (XML/HTML parser) and simple regular expressions. The site uses AJAX often so there are often JSON objects sitting around on various pages that you can load into memory with a JSON parser, too. These are sometimes more convenient than scraping the HTML.
Out of sheer curiosity - could you share that script? (I am trying to learn Regex, and Nokogiri :P )
04-06-15, 07:24 PM   #7
Banknorris
A Chromatic Dragonspawn
 
Join Date: Oct 2014
Posts: 153
Maybe those are the files you need (let me know if you can't see them):
http://www.4shared.com/folder/K-KvnCcd/_online.html

I used the same technique as in this thread:
http://www.wowinterface.com/forums/s...ad.php?t=49599

Last edited by Banknorris : 04-07-15 at 12:08 AM.
04-06-15, 11:33 PM   #8
elcius
A Cliff Giant
Join Date: Sep 2011
Posts: 75
wowhead scraping will be the easiest option.
for pet data:
Code:
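// Run in the browser console on the Wowhead pet species list page.
var pets = g_listviews.petspecies.data;
for (var i = 0; i < pets.length; i++) {
	var p = pets[i];
	// One CSV row per pet: species id, quoted name, then the ability ids
	console.log(p.species + ',"' + p.name + '",' + (p.abilities || []).join(','));
}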
for abilities:
Code:
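// Run in the browser console on the Wowhead pet abilities list page.
var abilities = g_listviews.petabilities.data;
for (var i = 0; i < abilities.length; i++) {
	var s = abilities[i];
	// One CSV row per ability: id, quoted name, then the numeric fields
	console.log(s.id + ',"' + s.name + '",' + [s.damage, s.healing, s.duration, s.accuracy, s.type].join(','));
}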
Press Shift-F5 to open the developer tools (Firefox), go to the appropriate page and run the code in the console. You'll get CSV lists; change the fields as needed.
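
One caveat with the snippets above: they wrap names in double quotes but do not escape a quote that appears inside a name, which would break the CSV. A minimal sketch of a safer row builder is below; csvField and csvRow are hypothetical helper names, and the escaping follows the usual CSV convention of doubling embedded quotes.

```javascript
// Quote a single CSV field only when it needs it, doubling any embedded quotes.
function csvField(value) {
  var s = (value === null || value === undefined) ? '' : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Join an array of values into one CSV row.
function csvRow(values) {
  return values.map(csvField).join(',');
}
```

In the pet loop, the console.log call could then become console.log(csvRow([p.species, p.name].concat(p.abilities || []))).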
04-07-15, 03:04 AM   #9
Dimmulux
A Deviate Faerie Dragon
Join Date: Feb 2014
Posts: 17
Thanks for all the replies.

Originally Posted by Barjack
I tried contacting Wowhead a couple of times about scraping their data without much luck (never got a response) and in the end I just decided to go ahead and do it since I wasn't going to use it for any nefarious purpose.
That's good to know, thank you. If I end up needing to scrape from Wowhead, I'll just do it.

Originally Posted by Banknorris
Maybe those are the files you need (let me know if you can't see them):
http://www.4shared.com/folder/K-KvnCcd/_online.html

I used the same technique as in this thread:
http://www.wowinterface.com/forums/s...ad.php?t=49599
These look very much like the files I need, yes. Thank you very much for uploading them. I've had a look through them all and it looks like they are missing the pet names. It's possible that they are missing other information as well, but I can't easily tell as only the column types (not the names) have been kept. With some time, I might be able to deduce what statistic each column represents, but it certainly won't be easy when there are lots of columns of ints.

This is definitely the best find so far, though. I should be able to match the speciesId from these csvs with the speciesId from the breedsperpet csv (obtained by converting the spreadsheet from petsear.ch to a csv) to get the pet names. I'm very grateful for these csvs.

Originally Posted by elcius
wowhead scraping will be the easiest option.
for pet data:
Thanks for the suggestion. As I understand it, this method would require me to visit each of the pages manually. As there are over 700 player-ownable pets, this would take a while. A greater problem, though, is that the ability information obtained in this way would not be comprehensive enough. For example, it ignores all status effects (DoT damage, weather and all other buffs/debuffs).
04-07-15, 03:38 AM   #10
elcius
A Cliff Giant
Join Date: Sep 2011
Posts: 75
As I understand it, this method would require me to visit each of the pages manually. As there are over 700 player-ownable pets, this would take a while.
nope, all the abilities and possible breeds are listed on the page you linked, if there is info not available in the list-view (nothing relevant that i can see), you can use their jQuery to handle the requests:
Code:
var a = g_listviews.petspecies.data;
var n = a.length; // number of requests still pending
for (var i = 0; i < a.length; i++) (function (i) {
	// Fetch the NPC page for each pet species in the list view
	$.get('http://www.wowhead.com/npc=' + a[i].npc.id, function (html) {
		console.log((--n) + ' remaining');
		// pattern match the html for whatever.
	});
})(i);
takes less than a minute to go through the page you linked.
04-07-15, 05:59 AM   #11
Banknorris
A Chromatic Dragonspawn
 
Join Date: Oct 2014
Posts: 153
I also uploaded creature.db2.csv. To get the names:
the second column of BattlePetSpecies.db2.csv gives the NPC number of the pet;
the first column of creature.db2.csv is also the NPC number, and column 15 of that same row is the name.
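
The lookup described above can be sketched as a simple join on the NPC number. This is only an illustration: namesBySpecies is a hypothetical name, the rows are assumed to be already split into arrays of strings, "column 15" is assumed to be counted from 1 (index 14), and the species ID is assumed to sit in the first column of BattlePetSpecies.db2.csv.

```javascript
// speciesRows: rows of BattlePetSpecies.db2.csv, creatureRows: rows of creature.db2.csv.
// Each row is an array of strings (the CSV is assumed to be parsed already).
function namesBySpecies(speciesRows, creatureRows) {
  var NAME_COL = 14; // column 15, counted from 1 (assumption)

  // Index creature names by NPC number (first column of creature.db2.csv).
  var nameByNpc = new Map();
  creatureRows.forEach(function (row) {
    nameByNpc.set(row[0], row[NAME_COL]);
  });

  // Look up each species' NPC number (second column) in that index.
  return speciesRows.map(function (row) {
    var npcId = row[1];
    return {
      speciesId: row[0], // assumed to be the first column
      npcId: npcId,
      name: nameByNpc.has(npcId) ? nameByNpc.get(npcId) : null
    };
  });
}
```

Species whose NPC number has no match come back with a null name, which makes gaps in the data easy to spot.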

I could not find out how to get a list of abilities from a pet species but maybe you can get them with this api:
http://wow.gamepedia.com/API_C_PetJo...PetAbilityList

Last edited by Banknorris : 04-07-15 at 11:04 AM.
04-08-15, 04:08 PM   #12
Dimmulux
A Deviate Faerie Dragon
Join Date: Feb 2014
Posts: 17
nope, all the abilities and possible breeds are listed on the page you linked, if there is info not available in the list-view (nothing relevant that i can see), you can use their jQuery to handle the requests:
Ah, I see. Thanks for clearing that up for me. I suppose there would still be the issue that this is just one of the pages listing pets (there are 50 such pages), but that would certainly be quicker than 700+. I think a greater problem is that, on inspection, the Wowhead page for each ability doesn't seem to have enough information. Unlike player ability pages, there is no indication of which auras/status effects (if any) are applied by the ability. Parsing the description text to deduce the auras applied might be possible, but would be unpleasant, error-prone and a poor way of finding the information.

I also uploaded the creature.db2.csv. To get the names:
from BattlePetSpecies.db2.csv second column you get the npc number of the pet,
creature.db2.csv first column is also the npc number then you look at column 15 that will be the name.

I could not find out how to get a list of abilities from a pet species but maybe you can get them with this api:
http://wow.gamepedia.com/API_C_PetJo...PetAbilityList
Thanks for the file. I downloaded it but after a bit of thought it seemed more sensible to use the CSV I got through petsear.ch as I can calculate the base stats from that file too. I've parsed that information in and now have it stored in a CSV format more appropriate for my program.

I've spent a fair bit of time looking through the other files, but without column headings I'm really quite lost. The strings aren't a problem, of course, and I think I've found all the fields that store ids, but I can't work out how to interpret the other numeric values. Is it possible to obtain the column headers or some comments on how to interpret the files? I realise these are taken directly from the game database, so I can understand if there is no proper documentation for them.

That API may find me the abilities for the pets, so it could be useful, but it won't get the effects of each ability as well. If all the information I need is contained in the other CSVs, it would seem more straightforward to get it all from them, if they can be understood.
04-08-15, 05:21 PM   #13
Petrah
A Pyroguard Emberseer
 
Join Date: Jan 2008
Posts: 2,988
Just my opinion, but I don't think scraping any site for this particular information is the best option. Blizzard is extremely slow to get new pet info updated to characters on Battle.net.

Gathering the info with an addon from inside the game is the best option for updated information. It's killin' me that I cannot remember the name of a guild addon that does this exact thing where all you do is click a button, copy all the lines, and paste it into an Excel spreadsheet.
04-08-15, 05:43 PM   #14
Dimmulux
A Deviate Faerie Dragon
Join Date: Feb 2014
Posts: 17
Originally Posted by Petrah
Just my opinion, but I don't think scraping any site for this particular information is the best option. Blizzard is extremely slow to get new pet info updated to characters on Battle.net.
I agree that scraping doesn't seem like the best solution here, but rather for the reason that I can't find any site that has all the information I'm after. I'm not looking for account/character-specific information, though: I'm just after a listing of all pets, abilities and their effects. I don't think it causes a problem for me if information on which accounts own which pets takes a long time to update through Battle.net as this isn't the information I'm after anyway.

Originally Posted by Petrah
Gathering the info with an addon from inside the game is the best option for updated information. It's killin' me that I cannot remember the name of a guild addon that does this exact thing where all you do is click a button, copy all the lines, and paste it into an Excel spreadsheet.
I don't know if there is any API that gives the effects of each ability (outside the text description). If so, this could well be a good option.

Last edited by Dimmulux : 04-08-15 at 06:07 PM.
04-08-15, 05:58 PM   #15
Petrah
A Pyroguard Emberseer
 
Join Date: Jan 2008
Posts: 2,988
I'm after the same thing you are, but with a bit extra too: levels, IDs, spells, etc.

After a full day of gathering and leveling new battle pets, I go to either petsear.ch or warcraftpets and attempt to snag a list of those new updates and nothing new shows up. There've been times where I have had to wait 4 to 5 days for that new info to show up on my battle.net characters. Extremely annoying.