In import_ref_by_revs(), the following code ends up fetching far too many pages on the initial clone:
    my $revision_ids = [ $fetch_from .. $last_remote ];
    return $self->import_revids( $fetch_from, $revision_ids, $pages );
On the first fetch, $fetch_from is 0 and $last_remote is the last revid of the most recent page, so the range covers every revision ID up to that point. We do not want all pages; we only want the pages that apply to our subset.
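One possible direction, shown only as a rough sketch: build the revid list from the tracked pages rather than from the full numeric range. The helper name revids_for_subset and the page-title-to-revids hash are hypothetical and exist only for this illustration; how the real code would obtain the revision IDs per page is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the existing code): collect only the
# revision IDs that belong to the tracked pages, instead of expanding the
# whole $fetch_from .. $last_remote range.
# $revs_by_page maps a page title to an array ref of its revision IDs;
# this structure is an assumption made for the example.
sub revids_for_subset {
    my ( $fetch_from, $revs_by_page ) = @_;

    # Flatten the per-page lists and drop revisions older than $fetch_from.
    my @wanted = grep { $_ >= $fetch_from }
                 map  { @{$_} } values %{$revs_by_page};

    # Return the IDs in ascending numeric order.
    return [ sort { $a <=> $b } @wanted ];
}

# Made-up data: two tracked pages on a wiki with many other revisions.
my $revs_by_page = {
    'Main Page' => [ 3, 17, 42 ],
    'Help'      => [ 8, 90 ],
};

my $revision_ids = revids_for_subset( 0, $revs_by_page );
print join( ',', @{$revision_ids} ), "\n";    # prints 3,8,17,42,90
```

With this made-up data the list holds five revids, whereas [ 0 .. 90 ] would hold 91, most of them belonging to pages outside the subset.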