Before I get started with this thread, I'd like to say that I am only just learning Java, and can't get enough of it. I'm extremely eager to understand what's in a file format, and how to be able to create my own.
My interest in this probably lies in the fact that I've always been interested in game modification. And in that field, if you don't have public tools, you probably don't have anything that can read all those files.
I'm going to be teaching myself (hopefully with the help of you guys) to access some files that no public tool exists for. They are game files created for that game, but they're old and small so once I get a grip on how files in general are structured, I should be able to start to decode them fairly quickly.
My questions are these.
1) Where should I start my research on understanding the structure of a file? I understand a little, but are there good resources for this?
2) What does it take to create a file? What level are they usually made on?
3) Any help with the language of this field? Obviously, I'm illiterate as of now.
4) Am I getting myself way in over my head? If all it needs is brains and self motivation, I'll be all over this.
If I've said anything dumb, feel free to correct me. If this is in the wrong place, feel free to move it. You know how it goes. The only thing that I'll ask is please be friendly, because I really do want to learn.
Thanks!
Well, there are generally two ways to store data. You can use a binary format or a more or less human readable format. For both formats there are different ways of encoding data.
Binary formats differ a lot from each other, but plain text formats are usually encoded in XML, JSON or another well known encoding method. To get the idea of how binary formats work, [url=http://en.wikipedia.org/wiki/BMP_file_format]BMP[/url] is one of the easiest formats to understand.
Java is pretty clumsy about reading binary files. You can't seek on a plain stream (you need RandomAccessFile or a FileChannel for that), you can't read data structures all at once, etc. It doesn't have unsigned integer types either (apart from char), which makes life difficult when the file format calls for them.
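For what it's worth, the usual workaround for the unsigned problem is to read the signed type and mask it into the next wider one. Here's a minimal sketch, assuming the file's bytes are already in memory and the format is little endian; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not from any library): reads unsigned little-endian
// fields out of a byte array by masking signed values into a wider type.
final class UnsignedReadSketch {
    // Unsigned 8-bit value at the given offset, widened to int.
    static int u8(byte[] data, int offset) {
        return data[offset] & 0xFF;
    }

    // Unsigned 16-bit little-endian value, widened to int.
    static int u16le(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getShort(offset) & 0xFFFF;
    }

    // Unsigned 32-bit little-endian value, widened to long.
    static long u32le(byte[] data, int offset) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt(offset) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(u8(data, 0));
        System.out.println(u16le(data, 0));
        System.out.println(u32le(data, 0));
    }
}
```

Without the masks, every one of those reads would come back as -1.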
It's a total nightmare on Android where you can't create a RandomAccessFile directly from an asset. I learned this the hard way.
If your main goal is just to learn about file formats and you aren't tied to a language because you've already started a project with it, use C or C++. They're a lot less difficult to work with in this case.
By Java I mean I'm taking a class on it. These aren't subjects we're learning. The farthest I think we'll ever get into binary in that class is learning to count (already did).
I'm looking at the hex for a simple 2x2 bmp I made, and I think I'm understanding so far. The only thing confusing me is "color plane". I'm having trouble understanding it, but I'll keep working at it.
Thanks for the help so far, guys!
We're hardly into the class, so I don't understand really anything about the language so far :(
For creating tools to work with video game file formats, it's a good idea to understand how textures/sounds/models are stored, both in binary and readable formats. Some games, like Far Cry 2, will be lazy and take a standard DDS texture, slap a header on it, and call it an XBT file. Others will do more complicated things which involve a lot of work to view. They may compress it with DXT but store it in a very strange way, or they may optimize it for consoles and store it in big endian instead of little endian, which is what x86 uses. You would need to convert that.
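The endian conversion is the easy part. A minimal sketch, assuming you already have the raw bytes in an array; the class name and sample bytes are made up, and `ByteBuffer` and `Integer.reverseBytes` are standard Java.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class: the same four bytes read in both byte orders.
final class EndianSketch {
    // Interpret four bytes starting at offset as a 32-bit int in the given order.
    static int readInt(byte[] data, int offset, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt(offset);
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, 0x00, 0x01, 0x00};
        int big = readInt(raw, 0, ByteOrder.BIG_ENDIAN);       // 256
        int little = readInt(raw, 0, ByteOrder.LITTLE_ENDIAN); // 65536
        System.out.println(big);
        System.out.println(little);
        // An int that was read in the wrong order can be fixed up afterwards.
        System.out.println(Integer.reverseBytes(big) == little);
    }
}
```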
Making your own file format is simple. Just use a char or series of chars to identify the file as your file format, then store your data as variables separated by commas (something simple to work with). You can google CSV (comma separated value) parsers and such if you want to read them back into your program.
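A rough sketch of that idea, assuming the values are plain ints and the identifier goes on its own first line. The "MYF1" identifier and the class name are made up, and a real CSV parser would also have to handle quoting.

```java
import java.util.Arrays;

// Hypothetical format: an identifier line, then one line of comma-separated ints.
final class SimpleFormatSketch {
    static final String MAGIC = "MYF1"; // made-up identifier for this sketch

    static String encode(int[] values) {
        StringBuilder sb = new StringBuilder(MAGIC).append('\n');
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(values[i]);
        }
        return sb.toString();
    }

    static int[] decode(String text) {
        String[] lines = text.split("\n", 2);
        // Reject anything that does not start with our identifier.
        if (!lines[0].equals(MAGIC)) {
            throw new IllegalArgumentException("not a " + MAGIC + " file");
        }
        if (lines.length < 2 || lines[1].isEmpty()) return new int[0];
        String[] parts = lines[1].split(",");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        String text = encode(new int[] {3, 14, 15});
        System.out.println(text);
        System.out.println(Arrays.toString(decode(text)));
    }
}
```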
[QUOTE=DOG-GY;24502088]I'm looking at the hex for a simple 2x2 bmp I made, and I think I'm understanding so far. The only thing confusing me is "color plane".[/QUOTE]
I'm pretty sure "color planes" aren't essential to reading a bitmap. That field is always 1 in a valid BMP, so you can just discard the value.
I'm semi-familiar with the format, as I've written a bmp loader that I used in a few projects.
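To show where that field sits, here's a minimal header-reading sketch. It assumes the common 40-byte BITMAPINFOHEADER variant (older variants lay the fields out differently), and the class name and the hand-built 2x2 sample are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical reader for a BMP with a 40-byte BITMAPINFOHEADER.
final class BmpHeaderSketch {
    final int pixelDataOffset, width, height, planes, bitsPerPixel;

    BmpHeaderSketch(byte[] file) {
        if (file.length < 30 || file[0] != 'B' || file[1] != 'M') {
            throw new IllegalArgumentException("not a BMP file");
        }
        // All multi-byte BMP fields are little endian.
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        pixelDataOffset = buf.getInt(10);          // where the pixel rows start
        width = buf.getInt(18);
        height = buf.getInt(22);
        planes = buf.getShort(26) & 0xFFFF;        // the "color planes" field
        bitsPerPixel = buf.getShort(28) & 0xFFFF;
    }

    // Builds the headers of a 2x2, 24-bit BMP by hand (pixel bytes left as zeros).
    static byte[] sample2x2() {
        ByteBuffer buf = ByteBuffer.allocate(70).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(0, (byte) 'B').put(1, (byte) 'M');
        buf.putInt(2, 70);            // file size: 54 header bytes + 2 rows of 8 bytes
        buf.putInt(10, 54);           // pixel data offset
        buf.putInt(14, 40);           // size of the info header
        buf.putInt(18, 2);            // width
        buf.putInt(22, 2);            // height
        buf.putShort(26, (short) 1);  // color planes
        buf.putShort(28, (short) 24); // bits per pixel
        return buf.array();
    }

    public static void main(String[] args) {
        BmpHeaderSketch h = new BmpHeaderSketch(sample2x2());
        System.out.println(h.width + "x" + h.height + ", planes=" + h.planes
                + ", bpp=" + h.bitsPerPixel + ", pixels at " + h.pixelDataOffset);
    }
}
```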
[QUOTE=robmaister12;24502247]store your data as variables separated by commas (something simple to work with). You can google CSV (comma separated value) parsers and such if you want to read them back to your program.[/QUOTE]
I fail to see the advantages here.
It's not human-readable like XML or something and it's not as compact and easy to parse as a binary file.
Trying to wrap my head around bitmaps. The one I made was stored a bit differently than wikipedia's but that's ok. I'm working towards it.
I'm also looking at the OBJ filetype because it's very easily read. The files that I've got seem to be both readable and binary. They contain meshes, so once I get to understanding it, I may be able to easily convert between the two.
As far as readable goes they all have "VERT" "INDL" and "GRPL" in them. Seeing vert seems like a good sign to me, but I could just be an idiot.
[editline]08:22PM[/editline]
VERT is at the top of the file, and I'm assuming those are the locations of the vertices. I don't know about INDL and GRPL, but they've definitely got something going on. Some files also have GTOM and MATL. I'm guessing the latter is material.
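A first step with files like these is usually just listing where each readable tag occurs, so the bytes in between can be studied. A minimal sketch, assuming nothing about the layout beyond the tags being plain ASCII; the class name and the sample bytes are made up and are not the real game format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds every offset of an ASCII tag inside raw file bytes.
final class TagScanSketch {
    static List<Integer> findTag(byte[] data, String tag) {
        byte[] needle = tag.getBytes(StandardCharsets.US_ASCII);
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + needle.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) offsets.add(i);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Made-up bytes for the demo; the real files' structure is unknown.
        byte[] data = {'V', 'E', 'R', 'T', 0, 0, 0, 3, 'x', 'x', 'x', 'I', 'N', 'D', 'L', 0, 0};
        for (String tag : new String[] {"VERT", "INDL", "GRPL"}) {
            System.out.println(tag + " at " + findTag(data, tag));
        }
    }
}
```

Comparing the gaps between tag offsets across several files is one way to guess whether a length or count follows each tag.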
This is the webpage I've used when I've written BMP readers (and writers) in C. [url]http://www.fortunecity.com/skyscraper/windows/364/bmpffrmt.html[/url]
(This is not language specific)
[editline]12:09AM[/editline]
Also, I made a program in C a few weeks ago that reads a BMP image and converts it into text characters that look sort-of like the image. In the code you can easily (relatively) see how I parsed the BMP image. [url]http://minipenguin.com/bta/bta.c[/url]
[QUOTE=DOG-GY;24503221]Trying to wrap my head around bitmaps. The one I made was stored a bit differently than wikipedia's but that's ok. I'm working towards it.
I'm also looking at the OBJ filetype because it's very easily read. The files that I've got seem to be both readable and binary. They contain meshes, so once I get to understanding it, I may be able to easily convert between the two.
As far as readable goes they all have "VERT" "INDL" and "GRPL" in them. Seeing vert seems like a good sign to me, but I could just be an idiot.
[editline]08:22PM[/editline]
VERT is at the top of the file, and I'm assuming those are the locations of the vertices. I don't know about INDL and GRPL, but they've definitely got something going on. Some files also have GTOM and MATL. I'm guessing the latter is material.[/QUOTE]
What type of obj file are you looking at?
[QUOTE=DOG-GY;24503221]VERT is at the top of the file, and I'm assuming those are the locations of the vertices. I don't know about INDL and GRPL, but they've definitely got something going on. Some files also have GTOM and MATL. I'm guessing the latter is material.[/QUOTE]
It sounds like you are not looking at the [url=http://www.royriggs.com/obj.html][b]Wavefront OBJ format[/b][/url].
From: [url]http://www.royriggs.com/obj.html[/url]
[quote]
The first character of each line specifies the type of command. If the first character is a pound sign, #, the line is a comment and the rest of the line is ignored.[/quote]
Pound sign = £
I don't trust that specification anymore.
[QUOTE=eXeC64;24587219]From: [url]http://www.royriggs.com/obj.html[/url]
Pound sign = £
I don't trust that specification anymore.[/QUOTE]
Uh, no.
[url]http://en.wikipedia.org/wiki/Number_sign[/url]
[quote=Wikipedia]
In the United States, the symbol is usually called the pound sign, and the key bearing this symbol on touch-tone phones is called the pound key.
[/quote]
[QUOTE=eXeC64;24587219]From: [url]http://www.royriggs.com/obj.html[/url]
Pound sign = £
I don't trust that specification anymore.[/QUOTE]
What you have up there is the pound sterling sign.
Sorry, poor wording. I'm not looking at OBJ files. I was looking at them to see how easy it would be to convert to one once I understand my file type. It should be piss easy since obj is plain text.
I'm looking at files that were designed specifically for a game.
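The OBJ side of such a converter could look roughly like this. It's a sketch that assumes the vertices and triangles have already been decoded from the game files into arrays; the class name and the input layout are made up. OBJ itself uses `v x y z` lines for vertices, `f` lines for faces with indices counted from 1, and `#` for comments.

```java
import java.util.Locale;

// Hypothetical exporter: turns decoded mesh data into OBJ text.
final class ObjWriterSketch {
    // vertices: {x, y, z} triples; triangles: zero-based vertex indices.
    static String toObj(float[][] vertices, int[][] triangles) {
        StringBuilder sb = new StringBuilder("# exported by a hypothetical converter\n");
        for (float[] v : vertices) {
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            sb.append(String.format(Locale.ROOT, "v %f %f %f\n", v[0], v[1], v[2]));
        }
        for (int[] t : triangles) {
            // OBJ face indices start at 1, so shift each index up by one.
            sb.append("f ").append(t[0] + 1).append(' ')
              .append(t[1] + 1).append(' ').append(t[2] + 1).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        float[][] vertices = {{0f, 0f, 0f}, {1f, 0f, 0f}, {0f, 1f, 0f}};
        int[][] triangles = {{0, 1, 2}};
        System.out.print(toObj(vertices, triangles));
    }
}
```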
There's no universal filetype. Someone can make a game and make a filetype for a level that's completely unique. Knowing how to read/write one filetype will not automatically allow you to handle other filetypes.
What filetype are you specifically trying to use?
[QUOTE=jmazouri;24587265]Uh, no.
[url]http://en.wikipedia.org/wiki/Number_sign[/url][/QUOTE]
It's interesting how North America insists on having its own meanings and spellings for everything. Are they compensating for something? Why not just use the existing spellings and conventions?
England and America are two countries separated by a common language.
--George Bernard Shaw
[QUOTE=eXeC64;24602577]It's interesting how North America insists on having its own meanings and spellings for everything. Are they compensating for something? Why not just use the existing spellings and conventions?
England and America are two countries separated by a common language.
--George Bernard Shaw[/QUOTE]
there are more people in the world using north american english than your silly european english, why you gotta be different!
(posting to point out derpiness)
[QUOTE=Mattk50;24602847]there are more people in the world using north american english than your silly european english, why you gotta be different!
(posting to point out derpiness)[/QUOTE]
Not true, really. Only Americans use American. Most others speak English, that includes people whose native language isn't English.
[QUOTE=arienh4;24611696]Not true, really. Only Americans use American. Most others speak English, that includes people whose native language isn't English.[/QUOTE]
Actually, AmE seems to be the most used English variant on the internet from my experience.
[QUOTE=esalaka;24612096]Actually, AmE seems to be the most used English variant on the internet from my experience.[/QUOTE]
True, but not even a fraction of the people who speak English participate on forums and the like.
[QUOTE=esalaka;24612096]Actually, AmE seems to be the most used English variant on the internet from my experience.[/QUOTE]
I'm not a native English speaker and I have no idea what I'm speaking.
Technically I've been taught British English in school but most of the content on TV is American...
Wait, how did this go from XML, CSV, or binary data storage to the percentage of the world that speaks American English?
I've actually just had the need to save a file, and I'm keeping it simple by using a TextWriter and a TextReader to write one value per line. I was considering doing it in binary: storing all the booleans in a BitArray, padding it with 0s until its length is a multiple of 8, then converting it to bytes and writing the byte array to a file. But then I thought about the rest of the file and how it would fit, and it really wouldn't unless I were writing memory addresses of ints and strings to a file, and reading them back would be tricky considering I wouldn't know where one string ends and another starts.
But then it all depends on your program's needs. A line delimited file works well for my program, which needs to store something like 40 bools to file, 10 ints, a few enums, a few strings, and a string array. If you have to store positions of items marked by ID numbers or something, keeping it in binary would work well.
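For comparison, the binary route is workable if every string is written with its length in front, which answers the "where does one string end" problem. Here's a minimal sketch in Java rather than .NET; the class name and the layout (a count, the packed flags, then length-prefixed strings) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical layout: flag count, flags packed 8 per byte, string count, strings.
final class BinarySettingsSketch {
    final boolean[] flags;
    final String[] strings;

    BinarySettingsSketch(boolean[] flags, String[] strings) {
        this.flags = flags;
        this.strings = strings;
    }

    // Packs flags eight to a byte, first flag in the lowest bit; padding bits stay 0.
    static byte[] packFlags(boolean[] flags) {
        byte[] packed = new byte[(flags.length + 7) / 8];
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) packed[i / 8] |= (byte) (1 << (i % 8));
        }
        return packed;
    }

    static byte[] write(boolean[] flags, String[] strings) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(flags.length);
            out.write(packFlags(flags));
            out.writeInt(strings.length);
            // writeUTF stores a 2-byte length before the text of each string.
            for (String s : strings) out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static BinarySettingsSketch read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            boolean[] flags = new boolean[in.readInt()];
            byte[] packed = new byte[(flags.length + 7) / 8];
            in.readFully(packed);
            for (int i = 0; i < flags.length; i++) {
                flags[i] = (packed[i / 8] & (1 << (i % 8))) != 0;
            }
            String[] strings = new String[in.readInt()];
            for (int i = 0; i < strings.length; i++) strings[i] = in.readUTF();
            return new BinarySettingsSketch(flags, strings);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(new boolean[] {true, false, true}, new String[] {"player one", "level 2"});
        BinarySettingsSketch loaded = read(data);
        System.out.println(data.length + " bytes, first string: " + loaded.strings[0]);
    }
}
```

For a few dozen values the line-per-value text file is still the simpler choice; the packing only pays off when there is a lot of data.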
[QUOTE=arienh4;24612165]True, but not even a fraction of the people who speak English participate on forums and the like.[/QUOTE]
"Not even a fraction" implies no English speakers at all, to be accurate.
Also, I prefer British English, it's much more awesome-r than its American version. :smile:
[editline]03:39PM[/editline]
[QUOTE=robmaister12;24612550]Wait, how did this go from XML, CSV, or binary data storage to the percentage of the world that speaks American English?[/QUOTE]
No idea.
[QUOTE=esalaka;24613009]"Not even a fraction" implies no English speakers at all, to be accurate.
Also, I prefer British English, it's much more awesome-r than its American version. :smile:[/QUOTE]
I know that's what it means literally, but I choose not to treat English as maths.
[QUOTE=robmaister12;24612550]Wait, how did this go from XML, CSV, or binary data storage to the percentage of the world that speaks American English?[/QUOTE]
arien
Exec64 actually.
[QUOTE=pikzen;24615363]Exec64 actually.[/QUOTE]
True, but around these parts I escalate the arguments.
[QUOTE=dag10;24594468]There's no universal filetype. Someone can make a game and make a filetype for a level that's completely unique. Knowing how to read/write one filetype will not automatically allow you to handle other filetypes.
What filetype are you specifically trying to use?[/QUOTE]
Huh? All I'm doing is taking mesh files from an old game and converting them to obj. All the mesh files are alike in structure, with some of them having material information.
This is also on hold for a week while I polish my portfolio. But thanks guys, and keep the American English debate elsewhere, please.
If you ever get into 3D games programming, write an ibsp loader/renderer. It's really neat the way they do things. Also, well-documented.