2

I want to obtain root domains from a list of URLs with REGEXEXTRACT but want the list of all possible TLDs to come from another sheet, to avoid specifying com|net|org|all other by hand. Basic idea:

  =REGEXEXTRACT(A2, "\w+\.(?:com|net|org)")

What I want:

  =REGEXEXTRACT(A2, "\w+\.(?:IMPORTRANGE("$SHEETKEY",!A2:A))")

Any way to achieve this?

pnuts
  • 17,883
  • 5
  • 55
  • 103
dzhi
  • 129
  • 4

2 Answers2

1

You need to do this.

Formula

new Google Sheet

=REGEXEXTRACT(IMPORTRANGE("URL_IMPORT_SHEET", "A2"), "(\w+\.\w+)$")

old Google Spreadsheet

=REGEXEXTRACT(IMPORTRANGE("KEY_IMPORT_SHEET", "A2"), "(\w+\.\w+)$")

For a range you need to do this.

=ARRAYFORMULA(REGEXEXTRACT(IMPORTRANGE("URL_OR_KEY", "A2:A14"), "(\w+\.\w+)$"))

References

Jacob Jan
  • 23,322
  • 16
  • 106
  • 195
0

Guessing that A2 contains something like https://webapps.stackexchange.com and that another sheet contains something like: A2 com, A3 net, A4 org then in say B1:

=TEXTJOIN("|",1,IMPORTRANGE(" k e y ","A2:A"))

should return something like com|net|org provided permission is granted.

This may be applied within a REGEXEXTRACT formula like so:

 =REGEXEXTRACT(A2,"\w+\.(?:"&B$1&")")

to return, for the example: stackexchange.com (a link to https://stackexchange.com/) and for https://www.draw.io/ return #N/A.

TEXTJOIN
IMPORTRANGE
REGEXEXTRACT

pnuts
  • 17,883
  • 5
  • 55
  • 103