Comments on db4free.net blog: Filling table with prime numbers

Oh, my bad, I forgot 1 is not prime.We could also ...

2008-06-04T00:25:00.000+02:00

Oh, my bad, I forgot 1 is not prime.

We could also filter out multiples of two, but that's left as an exercise for the reader. ;)

This version will also factor out the division, possible casting, and
floor function, replacing all that with the modulus operation (if your RDBMS does not support operator %, it probably supports function mod():

select * from artificial_range a
where a.id != 1 and not exists ( select * from artificial_range b where b.id != 1 and b.id < a.id and a.id % b.id = 0 );

Why not just use SQL to do this:First, create a ta...

2008-06-04T00:15:00.000+02:00

Why not just use SQL to do this:

First, create a table that is just a range of numbers:

create table artificial_range (id int not null primary key auto_increment, idz int not null default 0);

insert into artificial_range(idz) values (0);

insert into artificial_range (idz) select idz from artificial_range;

Repeat the last statement, doubling the table size each time, until it reaches the length you want.

Now, assuming that your RBDMS returns a float type for integral division, just do this:

create view artifical_range_prime as
select * from artificial_range a
where not exists ( select * from artificial_range b where b.id != 1 and b.id < a.id and a.id/ b.id = floor(a.id / b.id));

Note that the predicate 'and b.id < a.id' is just to optimize the process a bit.

If your RBDMS does not return a float type for integral division, just cast one of the operand (a.id or b.id) to float prior to the divisions.

Now the slow part is filling the table. That can b...

2006-06-06T10:03:00.000+02:00

Now the slow part is filling the table. That can be improved dramatically by using a doubling algorithm:

DELIMITER //
CREATE PROCEDURE fill_sieve(max INT)
DETERMINISTIC
BEGIN
DECLARE previous_max INT DEFAULT 1;

TRUNCATE sieve;

INSERT INTO sieve (id) VALUES (2);

WHILE previous_max <= max DO
INSERT INTO sieve (id) SELECT id + previous_max FROM sieve;
SET previous_max = previous_max * 2;
END WHILE;

DELETE FROM sieve WHERE id > max;

END //
DELIMITER ;

Using that method of filling brings the time to less than a second.

mysql> call sieve5(20000);
Query OK, 0 rows affected (0.63 sec)

Which can be improved further by using a memory table:

mysql> call sieve6(20000);
Query OK, 0 rows affected (0.24 sec)

-robin

It also occurred to me that you only really need t...

2006-06-05T13:41:00.000+02:00

It also occurred to me that you only really need to loop through while the counter is < SQRT(max).

So adding that in to my previous solution:

DELIMITER //
CREATE PROCEDURE sieve3 (max INT)
DETERMINISTIC
BEGIN
DECLARE max_root INT DEFAULT CEIL(SQRT(max));
DECLARE prime INT DEFAULT 1;

CALL fill_sieve(max);

WHILE NOT ISNULL(prime) AND prime < max_root DO
SELECT MIN(id) INTO prime FROM sieve WHERE id > prime;
DELETE FROM sieve WHERE id > prime AND (id % prime) = 0;
END WHILE;

END //
DELIMITER ;

Timings:

mysql> call sieve1(20000); # original
Query OK, 0 rows affected (11.63 sec)

mysql> call sieve2(20000); # first try
Query OK, 0 rows affected (7.47 sec)

mysql> call sieve3(20000); # with sqrt
Query OK, 0 rows affected (1.62 sec)

OK, how about this? I've not bothered with improvi...

2006-06-02T21:16:00.000+02:00

OK, how about this? I've not bothered with improving the population of the sieve but I've hoicked it out into a separate procedure to tidy things up.

When 'max' is 20000, it takes about 4.27 sec to the original's 11.75 sec on my machine.

-robin

DELIMITER //

CREATE PROCEDURE fill_sieve(max INT)
DETERMINISTIC
BEGIN
DECLARE counter INT DEFAULT 2;

TRUNCATE sieve;
WHILE counter < max DO
INSERT INTO sieve (id) VALUES (counter);
SET counter = counter + 1;
END WHILE;

END //

CREATE PROCEDURE sieve (max INT)
DETERMINISTIC
BEGIN
DECLARE counter INT DEFAULT 1;

CALL fill_sieve(max);

WHILE NOT ISNULL(counter) DO
SELECT MIN(id) INTO counter FROM sieve WHERE id > counter;
DELETE FROM sieve WHERE id > counter AND (id % counter) = 0;
END WHILE;

END //

DELIMITER ;