Bug 85 - Saving large .gig files in gigedit crashes the program
Summary: Saving large .gig files in gigedit crashes the program
Status: CLOSED DUPLICATE of bug 90
Alias: None
Product: libgig
Classification: Unclassified
Component: libgig
Version: 3.2.1
Hardware: PC Linux
Importance: P2 normal
Assignee: Andreas Persson
Reported: 2008-04-10 11:54 CEST by Devin Anderson
Modified: 2008-12-11 02:37 CET

Attachments
configure.in diff (31 bytes, patch)
2008-04-25 10:10 CEST, Devin Anderson
DLS.cpp diff (62 bytes, patch)
2008-04-25 10:11 CEST, Devin Anderson
DLS.h diff (170 bytes, patch)
2008-04-25 10:12 CEST, Devin Anderson
RIFF.cpp diff (64 bytes, patch)
2008-04-25 10:13 CEST, Devin Anderson
RIFF.h diff (328 bytes, patch)
2008-04-25 10:14 CEST, Devin Anderson

Description Devin Anderson 2008-04-10 11:54:03 CEST
If I attempt to save a .gig file that will be 2 GB or more, I get the following
error:



glibmm-ERROR **:
unhandled exception (type unknown) in signal handler

aborting ...
Aborted



At first, I thought there was a ulimit set, but `ulimit -a` shows "unlimited"
for file size.  So, I wrote a quick python script to create a 4 GB file, which
succeeded.

The reason that I've filed this bug under 'libgig' instead of 'gigedit' is
because I can see that 'libgig' uses 'lseek':

    off_t lseek(int fildes, off_t offset, int whence);

... and assumes that the 'unsigned long' offset it passes will be correctly cast
to an 'off_t'.

This is all well and dandy until the offset gets above 0x7fffffff.

Here is a small program I wrote to test off_t conversion:



#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    /* Largest offset that fits in a signed 32-bit off_t. */
    unsigned long l = 0x7fffffff;
    printf("%lu %lld\n", l, (long long) ((off_t) l));
    printf("%lu %llu\n", l, (unsigned long long) ((off_t) l));
    l++; /* 0x80000000: not representable if off_t is 32 bits wide */
    printf("%lu %lld\n", l, (long long) ((off_t) l));
    printf("%lu %llu\n", l, (unsigned long long) ((off_t) l));
    return 0;
}



... and here is its output:



2147483647 2147483647
2147483647 2147483647
2147483648 -2147483648
2147483648 18446744071562067968



As you can see, the conversion goes awry above 0x7fffffff.

A cheap solution to this problem is to convert your 'unsigned long' offset to an
'off_t' and to mask the data by 'ULONG_MAX':

    #include <limits.h>

    off_t pos = ((off_t) l) & ULONG_MAX;

... which will allow you to have up to 4 GB files.

However, the better solution is to change the unsigned long data being passed
around to 'off_t'.
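
As a rough illustration of the 'off_t' approach, here is a minimal sketch of a
range-checked conversion.  The helper name 'to_off_t' is hypothetical and not
part of libgig; it only shows how an out-of-range offset could be detected
instead of silently wrapping to a negative value:

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper, not part of libgig: convert an unsigned 64-bit file
   position to off_t.  Returns 0 and stores the result in *out on success,
   or -1 if pos does not fit into this build's off_t. */
static int to_off_t(uint64_t pos, off_t *out)
{
    /* Largest positive off_t, derived from its width.  This assumes off_t
       is a signed two's complement integer without padding bits. */
    const uint64_t off_max =
        ((((uint64_t) 1 << (sizeof(off_t) * 8 - 2)) - 1) << 1) | 1;

    if (pos > off_max)
        return -1;
    *out = (off_t) pos;
    return 0;
}
```

On success the result can be passed on to lseek(fd, o, SEEK_SET).  With a
32-bit 'off_t' the helper rejects anything above 0x7fffffff.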
Comment 1 Christian Schoenebeck 2008-04-11 14:08:34 CEST
Thanks for all these hints. Unfortunately I'm totally busy with finishing my
diploma thesis this month, but I'll take care of it in May.

Or ... hopefully somebody else does in the meantime ;-)
Comment 2 Devin Anderson 2008-04-23 08:16:30 CEST
Disregard most of my (above) analysis.  I didn't realize that you didn't have
64-bit offsets enabled.

Here is a list of diffs against the latest CVS version that allow files larger
than 2 GB to be saved on POSIX systems that support 64-bit file offsets.



configure.in diff:

38a39,40
> AC_SYS_LARGEFILE
>



src/DLS.cpp diff:

1256a1257
> #if (! DLS_LARGE_POOL_OFFSETS)
1258a1260
> #endif



src/DLS.h diff:

28a29,34
> #if POSIX && (defined _FILE_OFFSET_BITS) && (_FILE_OFFSET_BITS > 32)
> #define DLS_LARGE_POOL_OFFSETS 1
> #else
> #define DLS_LARGE_POOL_OFFSETS 0
> #endif
>



src/RIFF.cpp diff:

24,25d23
< #include "RIFF.h"
<
28a27,28
> #include "RIFF.h"
>



src/RIFF.h diff:

38a39,46
> // If config.h is available for inclusion, then it should be included before
> // other includes, as some of the data in other includes is configured there
> // (i.e. 64-bit file offsets).
>
> #ifdef HAVE_CONFIG_H
> # include <config.h>
> #endif
>
45,48d52
< #ifdef HAVE_CONFIG_H
< # include <config.h>
< #endif
<



With these diffs, I've been able to successfully save and reload .gig files that
are greater than 2 GB in length.  The patches were tested on a Fedora Core 8
system with Planet CCRMA.
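
To check that the AC_SYS_LARGEFILE change actually took effect in a given
build, something like the following sketch could be used.  It assumes that
config.h (or the compiler command line) defines _FILE_OFFSET_BITS before any
system header is included; the define is hard-coded here only to keep the
example self-contained, and the function name is made up:

```c
/* Normally supplied by config.h via AC_SYS_LARGEFILE; hard-coded here only
   so that the sketch compiles on its own. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <limits.h>
#include <sys/types.h>

/* Hypothetical check: returns 1 if off_t is wider than 32 bits, i.e. if
   offsets beyond 2 GB can be represented, and 0 otherwise. */
static int have_large_offsets(void)
{
    return sizeof(off_t) * CHAR_BIT > 32;
}
```

If this returns 0, the headers were most likely included before config.h,
which is the ordering problem the RIFF.h and RIFF.cpp diffs above address.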

I'm sure that the library still can't handle .gig files that are 4 GB or more,
given that 'unsigned long' variables are being passed around for file sizes and
offsets.

BTW, the File::Save() routine is EXTREMELY slow with huge files (I just waited
20 minutes for a 2.4 GB .gig file to save after adding forty samples or so to a
2.2 GB file).  It works, but moving around 4096 bytes of data at a time is
time-consuming (try attaching gdb to a gigedit process while it's saving a large
file and tracing 'ulPos' in File::Save).  The File::Save(path) routine is much
faster.
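
For illustration only, here is a minimal sketch of moving a region of a file
forward with a configurable block size instead of a fixed 4096 bytes.  This is
not libgig's code: 'shift_forward' and its parameters are hypothetical, and
whether a larger block size would actually speed up File::Save is untested.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread(), pwrite() and mkstemp() */
#endif

#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not libgig code: move `len` bytes starting at offset
   `from` forward by `delta` bytes, copying up to `bufsize` bytes per step.
   Blocks are copied from the end backwards so that overlapping source and
   destination ranges are not corrupted.  Short reads or writes are treated
   as errors.  Returns 0 on success, -1 on failure. */
static int shift_forward(int fd, off_t from, off_t len, off_t delta,
                         size_t bufsize)
{
    char *buf;

    if (bufsize == 0 || (buf = malloc(bufsize)) == NULL)
        return -1;

    while (len > 0) {
        size_t n = ((unsigned long long) len < bufsize) ? (size_t) len
                                                        : bufsize;
        off_t src = from + len - (off_t) n;   /* last block not yet moved */

        if (pread(fd, buf, n, src) != (ssize_t) n ||
            pwrite(fd, buf, n, src + delta) != (ssize_t) n) {
            free(buf);
            return -1;
        }
        len -= (off_t) n;
    }
    free(buf);
    return 0;
}
```

A call such as shift_forward(fd, pos, size, gap, 1 << 20) would move the data
in 1 MB steps; the block size is a tuning knob, not a recommendation.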
Comment 3 Devin Anderson 2008-04-25 10:10:24 CEST
Created attachment 26
configure.in diff
Comment 4 Devin Anderson 2008-04-25 10:11:37 CEST
Created attachment 27
DLS.cpp diff
Comment 5 Devin Anderson 2008-04-25 10:12:25 CEST
Created attachment 28
DLS.h diff
Comment 6 Devin Anderson 2008-04-25 10:13:24 CEST
Created attachment 29
RIFF.cpp diff
Comment 7 Devin Anderson 2008-04-25 10:14:13 CEST
Created attachment 30
RIFF.h diff
Comment 8 Devin Anderson 2008-05-27 11:35:52 CEST
After looking at bug #90, I'm not sure that large file support is the correct
solution.  If I knew more about the .gig multi-file format, then I'd attempt to
work out a real fix.

Does anybody know where to find documentation on the .gig multi-file format?
Comment 9 Anders Dahnielson 2008-07-23 01:19:41 CEST
(In reply to comment #8)
> Does anybody know where to find documentation on the .gig multi-file format?

Well, apparently libgig is currently able to load .gig multi-file format. So it
should be already documented somewhat in the libgig code.

Just my 2c.
Comment 10 Devin Anderson 2008-07-23 01:47:29 CEST
(In reply to comment #9)
> (In reply to comment #8)
> > Does anybody know where to find documentation on the .gig multi-file format?
> 
> Well, apparently libgig is currently able to load .gig multi-file format. So it
> should be already documented somewhat in the libgig code.
> 
> Just my 2c.

I've taken a look at the underbelly of libgig.

The library is _not_ well documented.

It's clear that there is an indexing structure that records which file of the
multi-file set contains any particular sample.

However, there is nothing about:

1.) Whether the 2 GB limit defined in DLS.cpp exists only because file offsets
are being passed around as 'long', or because the DLS and/or gig formats define
such a limit.  I think it's the former, as RIFF files can be 4 GB long.
2.) What happens when samples are longer than 4 GB?  I read somewhere that newer
versions of `gigastudio` can handle samples of up to 512 GB in length.  Is
RIFF64 in effect at that point?  Are large samples spread across several files,
and - if so - how are the large samples indexed?
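
On question 1: a RIFF chunk header stores its size as an unsigned 32-bit
little-endian field, which is where the 4 GB figure comes from.  The sketch
below (hypothetical helper names, not libgig code) reads that field and shows
that sizes from 0x80000000 upward are valid in RIFF but do not fit into a
signed 32-bit 'long':

```c
#include <stdint.h>

/* Read the 32-bit little-endian size field that follows the four-character
   ID in a RIFF chunk header (8 bytes in total). */
static uint32_t riff_chunk_size(const unsigned char hdr[8])
{
    return (uint32_t) hdr[4]
         | ((uint32_t) hdr[5] << 8)
         | ((uint32_t) hdr[6] << 16)
         | ((uint32_t) hdr[7] << 24);
}

/* Returns 1 if the size also fits into a signed 32-bit value, else 0. */
static int fits_signed32(uint32_t size)
{
    return size <= (uint32_t) INT32_MAX;
}
```

This only covers the container format; it says nothing about whether the gig
format itself imposes a lower limit.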

I had more questions back when I filed this bug, but can't remember them now.
Comment 11 Christian Schoenebeck 2008-12-03 00:04:59 CET
Do you guys think this issue should be addressed soon, or can it wait until
after the next libgig release?
Comment 12 Andreas Persson 2008-12-06 11:38:52 CET
I'm very sorry for taking so long to comment on this bug.

The gig format only allows physical files to be 2 GB or smaller. A larger gig is
always split up into extension files, all of which are 2 GB or smaller.

The code in DLS.cpp does unfortunately make it look like there are 64-bit
offsets in the format. I initially thought that this was the case, but now I'm
pretty sure that pWavePoolTableHi is only used as the number of the extension
file (.gxNN). This is also how it is used in gig.cpp for loading.
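
A minimal sketch of the two readings of a wave pool entry, for illustration
only.  The struct and function names are hypothetical and not libgig's API, and
the sketch assumes that file number 0 means the main .gig file and that
extension files are named with a two-digit number (e.g. "name.gx01"):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical type, not libgig's: where a sample's data lives. */
struct wave_location {
    uint32_t file_no;  /* assumed: 0 = main .gig file, N = extension file N */
    uint32_t offset;   /* byte offset inside that file */
};

/* Reading described above: the high word selects the file, the low word is
   the offset within it. */
static struct wave_location locate_wave(uint32_t pool_hi, uint32_t pool_lo)
{
    struct wave_location loc = { pool_hi, pool_lo };
    return loc;
}

/* The other reading: high and low word form a single 64-bit offset. */
static uint64_t as_offset64(uint32_t pool_hi, uint32_t pool_lo)
{
    return ((uint64_t) pool_hi << 32) | pool_lo;
}

/* Build an extension file name from a stem; the ".gx%02u" pattern is an
   assumption based on the ".gxNN" suffix.  Returns 0 on success, -1 if the
   buffer is too small. */
static int extension_name(char *out, size_t outsz, const char *stem,
                          uint32_t file_no)
{
    int n = snprintf(out, outsz, "%s.gx%02u", stem, (unsigned) file_no);
    return (n < 0 || (size_t) n >= outsz) ? -1 : 0;
}
```

For an entry with a high word of 1 and a low word of 0, the first reading
points at the start of extension file 1, while the second gives a single
offset of exactly 4 GB.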

As bug #90 says, libgig does not currently support saving of extension files, 
and I think that #90 is what we need to be able to save large gig files.

So, I think this report can be closed while #90 is kept open, and I think #90
will probably have to wait until after the next release.
Comment 13 Christian Schoenebeck 2008-12-06 12:43:52 CET

*** This bug has been marked as a duplicate of 90 ***