I am trying to extract just the domain name from a cell in Google Sheets. The cell can have either a .com or a .co.uk extension.
They always appear as naked domains, with no www or https:// prefix.
Column A will contain a random list of URLs, which will look like aol.com or amazon.co.uk.
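I have tried the following and many variations of it. I believe it essentially says "extract everything before the dot", but regex itself uses the dot as a special character, so that may be the source of the conflict.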
=REGEXEXTRACT(A4, "(.+).")
The result will eventually be piped into a QUERY that looks up the company name.
=query(Companies!A2:F,"select A where F contains '"&A2&"'")
Since the strings can only end in .com or .co.uk, you may use
=REGEXEXTRACT(A4, "^(.+)\.(?:co\.uk|com)$")
Alternatively, you can simply remove the suffix at the end of the string:
=REGEXREPLACE(A4, "\.(?:co\.uk|com)$", "")
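The two suffix-specific approaches (capture what precedes a trailing .com or .co.uk, or strip that suffix) can be sanity-checked outside of Sheets. The following is only a minimal Python sketch: it assumes Python's re module behaves like the Sheets regex engine for these particular constructs (anchors, groups, alternation), and the helper names extract_name and strip_suffix are hypothetical.

```python
import re

# Suffix pattern: a literal dot, then "co.uk" or "com", anchored at the end.
SUFFIX = r"\.(?:co\.uk|com)$"

def extract_name(domain):
    """Roughly mimic REGEXEXTRACT: return the captured group, or None if no match."""
    m = re.search(r"^(.+)" + SUFFIX, domain)
    return m.group(1) if m else None

def strip_suffix(domain):
    """Roughly mimic REGEXREPLACE: drop the suffix; non-matching input is unchanged."""
    return re.sub(SUFFIX, "", domain)

for d in ("aol.com", "amazon.co.uk", "example.org"):
    print(d, extract_name(d), strip_suffix(d))
```

The practical difference shows up on input with an unexpected suffix: the extract variant finds no match (Sheets reports an error in that case), while the replace variant returns the text unchanged.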
You might also consider a more generic pattern, such as
=REGEXEXTRACT(A4, "^(.+?)(?:\.co)?\.[^.]+$")
Pattern details
^ - start of string
(.+) - 1 or more chars other than line break chars, as many as possible
(.+?) - 1 or more chars other than line break chars, as few as possible (needed in the more generic patterns because the subsequent pattern is optional)
\.(?:co\.uk|com)$ - a . and then co.uk or com at the end of the string
(?:\.co)?\.[^.]+$ - an optional .co char sequence, then a . and 1 or more chars other than a . up to the end of the string.
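To see why the dot must be escaped and why the generic pattern needs the lazy (.+?), here is a small Python sketch. It assumes Python's re module matches these patterns the same way the Sheets engine does, and first_group is a hypothetical helper, not a Sheets function.

```python
import re

def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

# Unescaped dot: "." matches any character, so only the last character is dropped.
print(first_group(r"(.+).", "aol.com"))       # aol.co

# Generic pattern, greedy vs. lazy capture before the optional ".co" group.
GREEDY = r"^(.+)(?:\.co)?\.[^.]+$"
LAZY = r"^(.+?)(?:\.co)?\.[^.]+$"

print(first_group(GREEDY, "amazon.co.uk"))    # amazon.co (optional group skipped)
print(first_group(LAZY, "amazon.co.uk"))      # amazon
print(first_group(LAZY, "aol.com"))           # aol
```

With the greedy capture, the optional (?:\.co)? is never forced to match, so .co ends up inside the captured name. Note that the generic pattern only knows about .co as a second-level label: an input such as example.com.au would yield example.com.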