Consider a table with just two colums, say
I would like to find the fastest way to compute all the rows having
col2 equal to the
col2 column of a row having
col1 equal to a given value.
For example, consider the case where you want to find the names of all the people having the same surname of a person whose name is 'John'
SELECT t.* FROM table t JOIN table t2 ON t2.col2 = t.col2 WHERE t2.col1 = 'some value'
You will need indexes on both columns to get good performance.
Basically you need this query:
select t1.col1 from table t1, table t2 where t2.col1='John' and t1.col2=t2.col2
If you add an index to the col1, col2 fields it should be fast enough.
If you would like to generate an output table for huge amount of data, then probably an interim create table would be the fastest.
In SQL it is usually fastest, when you just write what you want. You want all records from tablex where col2 is in a certain set. This set is all col2 where col1 is 'John'.
select * from tablex where col2 in (select col2 from tablex where col1 = 'John');
An index on col1 will help you find John quickly. An index on col2 will help you find the others.
There are many ways to select the data you want. Modern dbms look into your query and find the best way to select the data. So it may be you re-write the query but the dbms notices it is the same query actually and uses the same execution plan.
So your first aim should be: Make the query readable. Here it is two things you do: Find John (inner query), find the others (outer query). So you don't join the tables, because you want to see only records from tablex with certain last names. Inside, the dbms may very well decide to join tablex with itself and then filter afterwards. But this is up to the dbms and doesn't have to bother you. (Although it's good to know that you can achieve the same with a self-join, an exists clause, and maybe other ways, too.)