Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I didn't post the code; nobody ever got back to me. What do you think,
shall I post it to the list or what? Having read the list the last few
days, it's more of a questions list than a dev list, so I wasn't sure
it was the correct venue. Are you one of the developers of dcc?

I've attached 2 files, essence4.c and essence6.c. The main difference
is that essence6 is sensitive to line breaks. It's been several weeks
since I looked at this stuff, so I sure hope I'm sending you the correct
files! I did most of my testing with essence6. I have 9 different
functions I played with, so there's some room for confusion.

You'll need to get the GNU arbitrary precision math lib, libgmp, from
your favorite GNU server.

I'd be happy to put one or both of these sums into a form that can be
included in the dcc project if there's interest. At the moment, they
read stdin (or a file named on the command line) and print a relatively
long number on stdout. Try adding or deleting a line from the input file
and running it again. The output should be the same or very similar down
to nearly the final digits. Obviously these fuzzy sums work best with
large files and few mods.

The good news is that you can always chop the number at a certain number
of digits to make the sums fuzzier. You'll have to do that anyway, since
DNS limits the overall length of a name which can be searched for. The
number also needs to be compacted to use all available bits. I'm
perfectly willing to do this; I just haven't yet, since I was
experimenting with the fuzzy functions first.

Let me know what you think.

-Mike

p.s. also, please don't repost this with my email address, use
http://www.grant.org/~mg-dcc instead, cheers.
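The chopping and compacting steps mentioned above are not in the attached files. A minimal sketch of one way they might look follows, under these assumptions: the digest arrives as the decimal string the programs print, base 36 (digits plus lowercase letters) is used as the compact alphabet because DNS names are case-insensitive, and the result has to fit in a single DNS label of at most 63 octets. The function name `digest_to_label` and both constants are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DNS_LABEL_MAX 63   /* longest single DNS label, in octets (RFC 1035) */
#define MAX_DIGITS    511  /* arbitrary cap on the digits this sketch handles */

/*
 * Hypothetical helper, not part of essence4.c or essence6.c.
 * Keeps the `keep` most significant decimal digits of `digest`, re-encodes
 * them in base 36 (an assumed encoding) and writes the result to `out`.
 * Returns the label length, or 0 on bad input or if the label does not fit.
 */
size_t digest_to_label(const char *digest, size_t keep, char *out, size_t outsz)
{
    static const char alphabet[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char work[MAX_DIGITS + 1];   /* decimal digits, divided in place */
    char rev[MAX_DIGITS + 1];    /* base-36 digits, least significant first */
    size_t len = strlen(digest);
    size_t start = 0, n = 0, i;

    if (keep < len)
        len = keep;              /* chop: fewer digits means a fuzzier sum */
    if (len == 0 || len > MAX_DIGITS)
        return 0;
    for (i = 0; i < len; i++) {
        if (digest[i] < '0' || digest[i] > '9')
            return 0;            /* reject anything that is not a digit */
        work[i] = digest[i];
    }

    /* Repeated long division by 36; each remainder is one base-36 digit. */
    while (start < len) {
        unsigned rem = 0;
        for (i = start; i < len; i++) {
            unsigned cur = rem * 10 + (unsigned)(work[i] - '0');
            work[i] = (char)('0' + cur / 36);
            rem = cur % 36;
        }
        rev[n++] = alphabet[rem];
        while (start < len && work[start] == '0')
            start++;             /* skip leading zeros of the quotient */
    }

    if (n > DNS_LABEL_MAX || n + 1 > outsz)
        return 0;
    for (i = 0; i < n; i++)
        out[i] = rev[n - 1 - i]; /* reverse to most significant first */
    out[n] = '\0';
    return n;
}
```

Calling `digest_to_label("1295999", 4, buf, sizeof buf)` keeps the digits `1295` and stores `zz` in `buf`. Base 36 carries roughly 5.2 bits per character against about 3.3 for decimal, so more of the sum fits in the same label. For an unchopped sum held in an `mpz_t`, GMP's `mpz_get_str` with base 36 should give the same digits without the hand-rolled division.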
Content-Type: text/plain; charset=us-ascii; name="essence4.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="essence4.c"

/* essence.c -*- compile-command: "cc essence.c -lgmp -o essence" -*- */
/* Copyright (c) 2002 Michael Grant <http://www.grant.org/~MGrant> */

/* digest = root of the sum of squares of space-separated words treated as numbers */
/*   sqrt( W^2 + W^2 + W^2 + ... ) */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gmp.h>

int main(int argc, char **argv)
{
    FILE *fp;
    char s[1024];
    char *cp, *tp;
    mpz_t n, sum;
    char *csum = NULL;          /* NULL lets mpz_get_str allocate the string */

    /* read the named file if given, otherwise stdin */
    if (argc > 1) {
        fp = fopen(argv[1], "r");
        if (fp == NULL) {
            perror(argv[1]);
            exit(1);
        }
    } else {
        fp = stdin;
    }

    mpz_init(n);
    mpz_init(sum);

    while (fgets(s, sizeof(s), fp) != NULL) {
        /* split the line into whitespace-separated words */
        for (cp = strtok(s, " \t\n\r"); cp && *cp; cp = strtok(NULL, " \t\n\r")) {
            mpz_set_ui(n, 0);                   /* n = 0 */
            for (tp = cp; *tp; tp++) {
                mpz_mul_2exp(n, n, 8);          /* n = n<<8 */
                /* go through unsigned char so bytes >= 0x80 are not sign-extended */
                mpz_add_ui(n, n, (unsigned long)(unsigned char)*tp); /* n = n + *tp */
            }
            mpz_mul(n, n, n);                   /* n = n**2 */
            mpz_add(sum, sum, n);               /* sum = sum + n */
        }
    }

    mpz_sqrt(sum, sum);         /* integer (truncated) square root */
    printf("%s\n", mpz_get_str(csum, 10, sum));
    return 0;
}

Content-Type: text/plain; charset=us-ascii; name="essence6.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="essence6.c"

/* essence.c -*- compile-command: "cc essence.c -lgmp -o essence" -*- */
/* Copyright (c) 2002 Michael Grant <http://www.grant.org/~MGrant> */

/* digest = root of the sum of squared per-line sums of space-separated words treated as numbers */
/*   sqrt( (W+W+W...W)^2 + (W+W+W...W)^2 + ... ) */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gmp.h>

int main(int argc, char **argv)
{
    FILE *fp;
    char s[1024];
    char *cp, *tp;
    mpz_t n, sum_words, sum_lines;

    /* read the named file if given, otherwise stdin */
    if (argc > 1) {
        fp = fopen(argv[1], "r");
        if (fp == NULL) {
            perror(argv[1]);
            exit(1);
        }
    } else {
        fp = stdin;
    }

    mpz_init(n);
    mpz_init(sum_words);
    mpz_init(sum_lines);

    while (fgets(s, sizeof(s), fp) != NULL) {
        mpz_set_ui(sum_words, 0);               /* sum = 0 */
        /* split the line into whitespace-separated words */
        for (cp = strtok(s, " \t\n\r"); cp && *cp; cp = strtok(NULL, " \t\n\r")) {
            mpz_set_ui(n, 0);                   /* n = 0 */
            for (tp = cp; *tp; tp++) {
                mpz_mul_2exp(n, n, 8);          /* n = n<<8 */
                /* go through unsigned char so bytes >= 0x80 are not sign-extended */
                mpz_add_ui(n, n, (unsigned long)(unsigned char)*tp); /* n = n + *tp */
            }
            mpz_add(sum_words, sum_words, n);   /* sum = sum + n */
        }
        mpz_mul(sum_words, sum_words, sum_words);   /* sum = sum**2 */
        mpz_add(sum_lines, sum_lines, sum_words);   /* lines += sum */
    }

    mpz_sqrt(sum_lines, sum_lines);             /* integer (truncated) square root */
    printf("%s\n", mpz_get_str(NULL, 10, sum_lines));
    return 0;
}
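
To check the "same or very similar down to nearly the final digits" behaviour described in the message, one option is to run either program on a file before and after a small edit and count how many leading digits the two outputs share. The sketch below is an assumption about how such a comparison could be done, not something taken from the attachments; `shared_leading_digits` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the attached programs.
 * Returns how many leading decimal digits two digests share. Digests of
 * different lengths differ in magnitude, so they are scored as 0.
 */
size_t shared_leading_digits(const char *a, const char *b)
{
    size_t i = 0;

    if (strlen(a) != strlen(b))
        return 0;
    while (a[i] != '\0' && a[i] == b[i])
        i++;
    return i;
}
```

For example, `shared_leading_digits("123456", "123499")` returns 4. A plain prefix comparison has an obvious blind spot: two sums that straddle a carry, such as 19999 and 20000, are numerically close but share no leading digits, so a real matcher would probably want to compare the difference of the sums rather than their prefixes.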