[gui-dev] Re: [codepatch] Verifying-on-the-fly + large-files (v2)
Philippe Verdy
2005-01-13 14:09:27 UTC
Just a remark: among the functions added to ByteOrder.java, I note that you
have named these methods:
- "beb2uint(...)" instead of just "beb2int(...)"
(these functions return signed ints);
- "uint2beb()" instead of just "int2beb()"
(these functions don't care about the sign, and according to the naming used
in the other similar functions, the "u" should not be there).

Also, some "& 255" operations are unnecessary when they are followed by a
left shift large enough that no sign-extension bit remains. The beb2long()
also uses too many conversions to long, and ORs longs instead of ints,
converting the final result to long only at the end. (These are minor
tweaks, not errors.)
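A minimal sketch of the first point (class and method names are mine, for
illustration only): when the masked byte is shifted into the top of an int,
the sign-extension bits land above bit 31 and are discarded by the 32-bit
shift, so the "& 0xFF" changes nothing.

```java
public class ShiftDemo {
    // "& 0xFF" is redundant for the byte shifted into the top position:
    // the sign-extension bits are pushed out of the 32-bit int entirely.
    static int withMask(byte b)    { return (b & 0xFF) << 24; }
    static int withoutMask(byte b) { return b << 24; }

    public static void main(String[] args) {
        // Exhaustively check all 256 byte values.
        for (int i = -128; i <= 127; i++) {
            byte b = (byte) i;
            if (withMask(b) != withoutMask(b))
                throw new AssertionError("differ at " + i);
        }
        System.out.println("identical for all 256 byte values");
    }
}
```

For a byte shifted by less than 24 bits, of course, the mask is still required.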

But the following function you gave is completely wrong:

public static int beb2long(final byte[] x, final int offset) {
    return (((int) x[offset    ] & 0xFF) << 56)
         | (((int) x[offset + 1] & 0xFF) << 48)
         | (((int) x[offset + 2] & 0xFF) << 40)
         | (((int) x[offset + 3] & 0xFF) << 32)
         | (((int) x[offset + 4] & 0xFF) << 24)
         | (((int) x[offset + 5] & 0xFF) << 16)
         | (((int) x[offset + 6] & 0xFF) <<  8)
         |  ((int) x[offset + 7] & 0xFF);
}

(Because an int shift distance is taken modulo 32 in Java, "<< 32" is a
no-op and "<< 56" is really "<< 24": you will return an int (not a long, as
the function name suggests) which is the OR of the two successive big-endian
32-bit ints. Note also that casting a byte like x[..] to an int is not
necessary: it is implicit with the "&" operator, which here always returns
an int because the second operand is an int constant.)

The correct function, with the minimum number of operations, is:
public static long beb2long(final byte[] x, final int offset) {
    return ( (long)( ( x[offset    ]         << 24)
                   | ((x[offset + 1] & 0xFF) << 16)
                   | ((x[offset + 2] & 0xFF) <<  8)
                   |  (x[offset + 3] & 0xFF)
                   ) << 32)
         | ( (long)( ( x[offset + 4]         << 24)
                   | ((x[offset + 5] & 0xFF) << 16)
                   | ((x[offset + 6] & 0xFF) <<  8)
                   |  (x[offset + 7] & 0xFF)
                   ) & 0xFFFFFFFFL);
}
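As a standalone sanity check of this corrected conversion (class name and
test values are mine): the high 32-bit half is sign-agnostic once shifted by
32, and the low half must be masked with 0xFFFFFFFFL before the OR.

```java
public class Beb2LongCheck {
    // Corrected big-endian bytes-to-long: two 32-bit halves merged at the end.
    public static long beb2long(final byte[] x, final int offset) {
        return ((long) ((x[offset] << 24)
                | ((x[offset + 1] & 0xFF) << 16)
                | ((x[offset + 2] & 0xFF) << 8)
                | (x[offset + 3] & 0xFF)) << 32)
             | ((long) ((x[offset + 4] << 24)
                | ((x[offset + 5] & 0xFF) << 16)
                | ((x[offset + 6] & 0xFF) << 8)
                | (x[offset + 7] & 0xFF)) & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        // High bit set, to exercise the sign-extension paths.
        byte[] b = { (byte) 0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, (byte) 0xFF };
        System.out.println(Long.toHexString(beb2long(b, 0))); // 80010203040506ff
    }
}
```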

The existing leb2long() functions should be written the same way: there is
no need to use 0xFFL constants that force all operations to 64 bits; use
long only when appropriate, i.e. once each complete 32-bit part is computed
and needs to be merged into a 64-bit number.
Note that Eclipse detects many such unnecessary conversions to long in the
existing code, because a byte is implicitly converted to long if it is ANDed
with a 64-bit constant like 0xFFL. If you remove these extra unneeded
conversions, you see immediately that using 0xFFL is not efficient: it
forces the runtime to convert all operands to 64 bits with extra opcodes,
uses more space on the operand stack, and disables many interesting
optimizations in the JIT compiler, notably when inlining the now-longer
functions; it also forbids performing most of the operations in native
registers, forcing a higher usage of native stack space.

public static long leb2long(final byte[] x, final int offset, final int n)
        throws IndexOutOfBoundsException, IllegalArgumentException {
    switch (n) {
    case 1:
        return (long)(   x[offset    ] & 0xFF );
    case 2:
        return (long)( ( x[offset    ] & 0xFF )
                     | ((x[offset + 1] & 0xFF) <<  8) );
    case 3:
        return (long)( ( x[offset    ] & 0xFF )
                     | ((x[offset + 1] & 0xFF) <<  8)
                     | ((x[offset + 2] & 0xFF) << 16) );
    case 4:
        return (long)( ( x[offset    ] & 0xFF )
                     | ((x[offset + 1] & 0xFF) <<  8)
                     | ((x[offset + 2] & 0xFF) << 16)
                     | ( x[offset + 3]         << 24) ) & 0xFFFFFFFFL;
    case 5:
        return ( (long)( ( x[offset    ] & 0xFF )
                       | ((x[offset + 1] & 0xFF) <<  8)
                       | ((x[offset + 2] & 0xFF) << 16)
                       | ( x[offset + 3]         << 24) ) & 0xFFFFFFFFL)
             | ( (long)(   x[offset + 4] & 0xFF ) << 32);
    case 6:
        return ( (long)( ( x[offset    ] & 0xFF )
                       | ((x[offset + 1] & 0xFF) <<  8)
                       | ((x[offset + 2] & 0xFF) << 16)
                       | ( x[offset + 3]         << 24) ) & 0xFFFFFFFFL)
             | ( (long)( ( x[offset + 4] & 0xFF )
                       | ((x[offset + 5] & 0xFF) <<  8) ) << 32);
    case 7:
        return ( (long)( ( x[offset    ] & 0xFF )
                       | ((x[offset + 1] & 0xFF) <<  8)
                       | ((x[offset + 2] & 0xFF) << 16)
                       | ( x[offset + 3]         << 24) ) & 0xFFFFFFFFL)
             | ( (long)( ( x[offset + 4] & 0xFF )
                       | ((x[offset + 5] & 0xFF) <<  8)
                       | ((x[offset + 6] & 0xFF) << 16) ) << 32);
    case 8:
        return ( (long)( ( x[offset    ] & 0xFF )
                       | ((x[offset + 1] & 0xFF) <<  8)
                       | ((x[offset + 2] & 0xFF) << 16)
                       | ( x[offset + 3]         << 24) ) & 0xFFFFFFFFL)
             | ( (long)( ( x[offset + 4] & 0xFF )
                       | ((x[offset + 5] & 0xFF) <<  8)
                       | ((x[offset + 6] & 0xFF) << 16)
                       | ( x[offset + 7]         << 24) ) << 32);
    default:
        throw new IllegalArgumentException("No bytes specified");
    }
}
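For testing the unrolled switch, a simple loop-based reference (a
hypothetical helper, not part of the patch) computes the same little-endian
value, at the cost of doing per-iteration long arithmetic:

```java
public class Leb2LongDemo {
    // Loop-based reference: little-endian, n bytes (1..8),
    // byte 0 is the least significant.
    static long leb2longLoop(byte[] x, int offset, int n) {
        long result = 0;
        for (int i = 0; i < n; i++) {
            result |= (long) (x[offset + i] & 0xFF) << (8 * i);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] b = { (byte) 0xEF, (byte) 0xCD, (byte) 0xAB, (byte) 0x89,
                     0x67, 0x45, 0x23, 0x01 };
        System.out.println(Long.toHexString(leb2longLoop(b, 0, 8))); // 123456789abcdef
    }
}
```

Comparing the unrolled switch against this loop over random inputs for every
n from 1 to 8 would catch any transcription mistake in the long version.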


These functions in ByteOrder are used so often in our LimeWire code that
allowing the JIT to inline them as much as possible will benefit
performance. The key issue is then to make these functions as short as
possible, also because it allows better use of the CPU-to-memory local
cache, with more available time slots for deferred writes to memory.

Well, if we really want to gain performance in LimeWire, we should make a
real effort to reduce the number of temporary immutable strings and byte
sub-arrays we allocate for parsing/formatting messages. Reusable indexed
byte buffers would greatly enhance performance and reduce the global memory
footprint. This is one key issue for routing performance in LimeWire.

This goes along with the overly large space used by large sparse bitsets
when running in UltraPeer mode, something I am working on: dynamically
sizing bitsets according to their selectivity. To improve data locality, a
smaller "root" bitset containing storage-mode bits for sub-bitsets should
greatly limit the parsing of sub-bitsets, which can then adopt a compressed
representation.

(I'm working on it so that the root bitset will have a maximum size of
64 Kbit with no sub-bitsets, but can shrink to 1 Kbit when many small
sub-bitsets are compressed. For a standard QRP table of 64 Kbit, QRP tables
would then be represented with a 1 Kbit root for fill rates below 2%. And if
a QRP table grows to 1 Mbit, a 1 Kbit root bitset would store the same
number of keys with equal selectivity; with a fill factor below 0.12%, the
sub-bitsets would each be up to 16 times 64 bits, which is highly
compressible.)

My idea is then not to compress the root bitset, so that it keeps direct
access and good enough selectivity, but to compress all the sub-bitsets,
which will have O(n) sequential access time.

I am testing several strategies to represent all the sub-bitsets in a
single array, using an RLE or deflate/LZW-like representation: it seems that
deflate offers the best compression, as it inherently avoids storing an
additional index referencing the beginning of each sub-bitset. My final
solution will more likely look like an integer trie with variable index
bit-size (the complexity comes from how I will allow the root bitset to
reference sub-bitsets efficiently; I want to keep direct access to every bit
of the root bitset, to see whether a bit is in one of two cases, PRESENT or
ABSENT, using the two values PROBABLYPRESENT and PROBABLYABSENT in the rarer
cases that require an access to the slow O(n) sub-bitsets).
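To make the two-level idea concrete, here is a rough sketch, not the actual
work in progress: all names and sizes are illustrative, and the sub-bitsets
are left uncompressed for simplicity.

```java
import java.util.BitSet;

// Illustrative two-level QRP-style bitset: a small direct-access root
// (1 Kbit) plus lazily allocated sub-bitsets covering 64 bits each,
// for a 64 Kbit logical table. All names and sizes are hypothetical.
public class TwoLevelBitSet {
    static final int ROOT_BITS   = 1024;  // 1 Kbit root, always direct-access
    static final int SUBSET_SPAN = 64;    // logical bits per root bit

    final BitSet root = new BitSet(ROOT_BITS);       // "any bit set in span?"
    final BitSet[] subsets = new BitSet[ROOT_BITS];  // detail, allocated on demand

    void set(int bit) {
        int slot = bit / SUBSET_SPAN;
        root.set(slot);
        if (subsets[slot] == null) subsets[slot] = new BitSet(SUBSET_SPAN);
        subsets[slot].set(bit % SUBSET_SPAN);
    }

    // A clear root bit means definitely ABSENT; only a set root bit
    // (the "probably present" case) requires touching the slower detail level.
    boolean get(int bit) {
        int slot = bit / SUBSET_SPAN;
        if (!root.get(slot)) return false;
        BitSet sub = subsets[slot];
        return sub != null && sub.get(bit % SUBSET_SPAN);
    }
}
```

In the real design, a deflate- or RLE-compressed encoding would replace the
plain array of sub-bitsets, trading the O(1) detail lookup for O(n)
sequential decompression on the rare root hits.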

----- Original Message -----
From: "Gregorio Roper" <***@gmx.li>
To: <***@limewire.org>
Sent: Tuesday, January 11, 2005 6:43 PM
Subject: [codepatch] Verifying-on-the-fly + large-files (v2)
As usual, I modified a lot more code than necessary, but I think I could
split the patch into several smaller patches. The main changes:
* LimeWire now basically supports file-sizes > 2GB (need to test)
* VerifyingFile was heavily modified, to allow requesting/verifying
TTH-chunks
* Most of the hashtree handling (except for the requesting) was moved
from ManagedDownloader to VerifyingFile
* Most of the work, previously done by assignWhite/assignGrey has been
moved to VerifyingFile because it seemed convenient
* In order to make the handling of verified intervals more efficient I'm
now using an IntervalSet based on a BitSet, that I named BitField (which
I'm also using for my BitTorrent implementation). I added some
convenient methods to the IntervalSet.
* RemoteHostData now stores the busy-state, host-address, port, guid,
push-address, etc. pp. and one RemoteHostData object is now shared
among RemoteFileDescs that come from the same host as identified by
client-GUID (or host & port if there is no client-GUID). To avoid
confusion I removed the 00000...-GUID Alt-locs were using and replaced
it with a clean *null*. The idea of sharing the host-address & port in
the RHD object may still prove to be very stupid - I'm not sure about it
yet.
* The ManagedDownloader now prevents hammering from hosts, in case we
are downloading many files from the same host, by setting the
RemoteHostData object associated with the host to busy, whenever it
connects. In addition, one busy signal from that host is now valid for
all manageddownloaders downloading from it.
* The dialog for corrupted downloads, the several corruption states and
overlap-checking were removed from the downloading code, because they
were not really compatible with my other changes :-). The GUI will now
simply display the 'file corrupted' message as download status, if the
SHA-1 does not match and there is no hashtree or it failed too many
times in a row. The download, however, will go into corrupted-state as
long as the file is incomplete.
* The VerifyingFile now remembers which hosts uploaded corrupt ranges. I
have not yet made up my mind whether to stop requesting from such a host
entirely, or just to request another chunk from it, and at what intervals
to allow retrying such a host.
Regards,
gregorio
_______________________________________________
codepatch mailing list
http://www.limewire.org/mailman/listinfo/codepatch