IPP> ADM - Minutes from PWG IPP Phone Conference - 98011

Carl Kugler kugler at us.ibm.com
Tue Jan 20 12:22:07 EST 1998


I had to see for myself so I've tried to work out some rough numbers.


Assumptions:
  1) The boundary delimiter has the maximum legal length of 70 characters.
  2) The boundary delimiter and the encapsulated data are generated by random,
     uncorrelated processes.
  3) The encapsulated data is a string of 8-bit octets.
  4) The encapsulated data is more than 70 octets long.


Let N be the number of octets in the encapsulated data.
The number of substrings of length 70 in the encapsulated data is N - 70 + 1, or
N - 69.
A substring matches the boundary delimiter with probability (1 / 256) ^ 70,
since there are 256 possibilities for each character and all characters have to
match, in order.
The expected number of matches is therefore (N - 69) * (1/256) ^ 70 or
(N - 69) / (256 ^ 70).


So, for example, transferring 1 GB files, you'd expect (1E9 - 69) / (256 ^ 70)
or  2.6E-160 failures per submission, which works out to a failure rate of
1 in  3.77E+159 trials.  Of course, the failure rate is lower for smaller
files;  1 in 3.77E+162 for 1 MB files.


If we challenge assumption 3 and say we're transferring 1 GB 7bit ASCII files,
then the failure rate increases to 1 in 1.85E+138 (taking 127 possible values
per character; with all 128 values it is about 1 in 3.20E+138).
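
For anyone who wants to check the arithmetic, here is a rough Python sketch of
the estimate. It counts n - len + 1 candidate positions for the boundary and
gives each the same match probability. The function names and the
alphabet_size parameter are my own, for illustration only, and the whole thing
rests on the same random, uncorrelated assumption as above.

```python
from fractions import Fraction


def expected_matches(n_octets, boundary_len=70, alphabet_size=256):
    """Expected number of accidental boundary matches in random data.

    Assumes every character is drawn uniformly and independently from
    alphabet_size possible values (an idealization, not real file data).
    """
    windows = n_octets - boundary_len + 1  # substrings of length boundary_len
    if windows <= 0:
        return Fraction(0)
    return Fraction(windows, alphabet_size ** boundary_len)


def trials_per_failure(n_octets, boundary_len=70, alphabet_size=256):
    """The "1 in X" figure: reciprocal of the expected number of matches."""
    return float(1 / expected_matches(n_octets, boundary_len, alphabet_size))


if __name__ == "__main__":
    print(f"1 GB, 8-bit: 1 in {trials_per_failure(10**9):.2e}")
    print(f"1 MB, 8-bit: 1 in {trials_per_failure(10**6):.2e}")
    print(f"1 GB, 7-bit: 1 in {trials_per_failure(10**9, alphabet_size=128):.2e}")
```

Exact rational arithmetic is used until the last step so that the tiny
probabilities do not underflow before the reciprocal is taken.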


In conclusion, I'd have to agree that this probability is insignificant (if my
assumptions are valid and I've done the math right).




  -Carl






ipp-owner at pwg.org on 01/19/98 10:58:54 PM
Please respond to ipp-owner at pwg.org @ internet
To: Carl Kugler/Boulder/IBM at ibmus
cc: ipp at pwg.org @ internet
Subject: Re: IPP> ADM - Minutes from PWG IPP Phone Conference - 98011




> The weakness with the MIME way is that it's either unsafe or slow -- either
> you arbitrarily pick a boundary string and hope that it doesn't appear in the
> binary data, or you prescan the data to make sure.  Content-length avoids those
> problems.


Actually, the fact of the matter is that it doesn't have to be either -- it is
quite easy to generate boundaries which in practice are so statistically
unlikely to ever appear in the message text that the chances of, say, message
corruption as a result of undetected network errors are many orders of
magnitude greater.
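
As a rough illustration of generating such a boundary, here is a minimal Python
sketch. It is only one possible approach, not a description of any particular
MIME implementation: make_boundary and the alphabet are hypothetical choices,
restricted to letters and digits as a conservative subset of the characters
allowed in a boundary, and capped at the 70-character maximum.

```python
import secrets
import string

# Assumption: letters and digits only, a conservative subset of the
# characters permitted in a MIME boundary.
BOUNDARY_ALPHABET = string.ascii_letters + string.digits
MAX_BOUNDARY_LEN = 70


def make_boundary(length=MAX_BOUNDARY_LEN):
    """Return a random boundary string of the requested length."""
    if not 1 <= length <= MAX_BOUNDARY_LEN:
        raise ValueError("boundary length must be between 1 and 70")
    # secrets.choice draws from the OS random source, so the boundary is
    # uncorrelated with the message data.
    return "".join(secrets.choice(BOUNDARY_ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(make_boundary())
```

With 62 possible characters per position instead of 256 the per-position odds
are not as small, but 62 ^ 70 is still an astronomically large number of
possible boundaries.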


    Ned



