How to Extract Root Domain from URL in Excel

If you manage hundreds of domains and thousands of webpages, you may be as well using a list of webpages with different structures, and that’s the reason why you may need to extract the root domain from each webpage, using an Excel formula. A same webpage may be referred in many ways:

http://danielpinero.com/index.php
http://www.danielpinero.com
http://danielpinero.com
www.danielpinero.com
danielpinero.com

I’ve created a formula to detect only the root domain under many possible variations. This formula cleans the URL from "http://", "www.", subdirectories and files. In case of subdomains, (for example, es.danielpinero.com), the formula keeps intact the first segment (es.)

To insert it in your Excel spreadsheet, you must:

1) Download MyExtract, a special function that allows breaking down a string into many parts. Copy it from MyExtract website. In Excel, go to View > Macros > write "myextract" > Create. Clean the preexistent text, paste the function you’ve just copied y and close the Excel VBA editor. The function won’t be visible as macro, but it has been recorded inside the Excel workbook and will be available as any other standard function.

2) Afterwards, place your URL in cell A1. Then, in cell B1, copy and paste this formula:

=SUBSTITUTE(IF(SUBSTITUTE(SUBSTITUTE(IF(RIGHT(MyExtract(A1;3;"F";"/");
1)="/";LEFT(MyExtract(A1;3;"F";"/");LEN(MyExtract(A1;3;"F";"/"))-1);MyExtract(A1;3;"F";"/"));
"http://";"";1);"www.";"";1)="*";A1;SUBSTITUTE(SUBSTITUTE(IF(RIGHT(MyExtract(A1;3;"F";"/");
1)="/";LEFT(MyExtract(A1;3;"F";"/");LEN(MyExtract(A1;3;"F";"/"))-1);MyExtract(A1;3;"F";"/"));
"http://";"";1);"www.";""));"www.";"")

The root domain has been extracted. You can extend this formula to the remaining cells, or save all this work by downloading it here: formula to extract root domain. If you’re using Excel 2010, remember that you must save this workbook as .xlsm

Contribute with a comment comment3 Comments
  1. 2015-04-17 Andy Morley

    This only seems to work if the subdomain is www. it doesnt work for xyz. or abc. for example... In that case its a lot faster to simply find and replace www. with "blank". Andy

    2013-10-12 Eric

    Hi Daniel,I've tried it, but it doesn't get rid of subdomains, which is exactly what I need. Not sure if I'm doing something wrong or the formula is not doing it.Any help would be appreciated.Thanks for sharing this! :)

    • 2013-10-13 Daniel

      Hi Eric. Can you send the Excel file to me, so I can check it out? Thanks.

    2013-11-06 Yann

    Hi, I cannot find a way to keep only the host part. I need to remove the subdomain, so for es.danielpinero.com I actually need to return danielpinero.com Any idea ?

mode_editContribute with a comment

menu
menu