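Environment: Ruby 1.9
\1 and $1 both come up constantly when working with regular expressions in Ruby. So what is the difference between them? Let's sort it out:
\1: a backreference, typically used in sub and gsub (and inside the pattern itself)
$1: a special Ruby global variable holding the first capture group of the most recent successful match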
Here are a few demos:
Demo 1:
# This regex behaves like /(\d+)cd12/: \1 refers back to the earlier (\d+), which matched "12".
"ab12cd12".gsub(/(\d+)cd(\1)/, "") # => "ab"
"ab12cd".gsub(/(\d+)/, '34\1') # => "ab3412cd"
Demo 2:
p "ab12cd".gsub(/(\d+)/, '34\1') # "ab3412cd"
p $1 # "12"
p "ab56cd".gsub(/(\d+)/, "78#{$1}") # "ab7812cd"; $1 here is still the "12" from the previous match
Demo 3:
p "ab12cd".gsub(/(\d+)/, '34\1') # "ab3412cd"
p $1 # "12"
str = "ab56cd".gsub(/(\d+)/) do |ele|
  "78#{$1}" # $1 here is "56"
end
p str # "ab7856cd"
Demo 4 (run on its own, with no earlier match):
p "ab56cd".gsub(/(\d+)/, "78#{$1}") # "ab78cd"; $1 is nil here
Demo 5:
str = "ab56cd".gsub(/(\d+)/) do |ele|
  "78#{$1}"
end
p str # "ab7856cd"
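Conclusions:
1. \1 and $1 are two different things with different uses.
2. Pay special attention: $1 behaves differently inside a gsub block than inside a replacement string. Use \1 in the replacement and $1 in the block. The source documentation says this explicitly.
3. \1 must be written in a single-quoted string (or as "\\1" in a double-quoted one), because a bare "\1" in double quotes is an octal escape for the character with code 1.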
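The quoting rule in point 3 is easy to check directly. The following is a small sketch (the variable names are only illustrative) comparing the three ways of writing the replacement:

```ruby
src = "ab12cd"
pattern = /(\d+)/

# Single quotes keep the backslash, so gsub receives \1 as a backreference.
single = src.gsub(pattern, '34\1')

# In double quotes the backslash must be escaped to reach gsub intact.
escaped = src.gsub(pattern, "34\\1")

# A bare "\1" in double quotes is an octal escape: the character with code 1.
octal = src.gsub(pattern, "34\1")

p single   # "ab3412cd"
p escaped  # "ab3412cd"
p octal    # "ab34\u0001cd"
```

In the last case gsub never sees a backreference at all, which is why the digits disappear from the result without any error.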
Here is the explanation from the source documentation:
# str.gsub(pattern, replacement) => new_str
# str.gsub(pattern) {|match| block } => new_str
#
#
# Returns a copy of <i>str</i> with <em>all</em> occurrences of <i>pattern</i>
# replaced with either <i>replacement</i> or the value of the block. The
# <i>pattern</i> will typically be a <code>Regexp</code>; if it is a
# <code>String</code> then no regular expression metacharacters will be
# interpreted (that is <code>/\d/</code> will match a digit, but
# <code>'\d'</code> will match a backslash followed by a 'd').
#
# If a string is used as the replacement, special variables from the match
# (such as <code>$&</code> and <code>$1</code>) cannot be substituted into it,
# as substitution into the string occurs before the pattern match
# starts. However, the sequences <code>\1</code>, <code>\2</code>, and so on
# may be used to interpolate successive groups in the match.
#
# In the block form, the current match string is passed in as a parameter, and
# variables such as <code>$1</code>, <code>$2</code>, <code>$`</code>,
# <code>$&</code>, and <code>$'</code> will be set appropriately. The value
# returned by the block will be substituted for the match on each call.
#
# The result inherits any tainting in the original string or any supplied
# replacement string.
#
#    "hello".gsub(/[aeiou]/, '*') #=> "h*ll*"
#    "hello".gsub(/([aeiou])/, '<\1>') #=> "h<e>ll<o>"
#    "hello".gsub(/./) {|s| s[0].to_s + ' '} #=> "104 101 108 108 111 "
#
#
def gsub(pattern, replacement)
  # This is just a stub for a builtin Ruby method.
  # See the top of this file for more info.
end
With a replacement string:
# If a string is used as the replacement, special variables from the match
# (such as <code>$&</code> and <code>$1</code>) cannot be substituted into it,
# as substitution into the string occurs before the pattern match
# starts. However, the sequences <code>\1</code>, <code>\2</code>, and so on
# may be used to interpolate successive groups in the match.
With a block:
# In the block form, the current match string is passed in as a parameter, and
# variables such as <code>$1</code>, <code>$2</code>, <code>$`</code>,
# <code>$&</code>, and <code>$'</code> will be set appropriately. The value
# returned by the block will be substituted for the match on each call.
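The timing described in these two excerpts can be made visible with a short sketch. It assumes nothing beyond core Ruby; Regexp.last_match(1) is used in the block as an equivalent spelling of $1, and the variable names are only illustrative:

```ruby
# Start from a known state: after this match, $1 is "12".
"ab12cd" =~ /(\d+)/

# The replacement string is an ordinary method argument, so "78#{$1}"
# is evaluated to "7812" before gsub ever looks at "ab56cd".
stale = "ab56cd".gsub(/(\d+)/, "78#{$1}")

# The block runs once per match, after the match data has been set
# for that particular match, so each call sees its own group 1.
fresh = "ab56cd ef90".gsub(/(\d+)/) { "78#{Regexp.last_match(1)}" }

p stale  # "ab7812cd"
p fresh  # "ab7856cd ef7890"
```

The block form is therefore the one to reach for when the replacement has to be computed from the captured text, for example by calling a method on it.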
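Multi-line matching is also very common; it is handy, for example, when extracting pieces of HTML source. To match across lines, just append m to the regex, which makes . match newline characters as well. For example: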
/http:(.*?)\s?/m
Ruby demos:
1.
p "ab\r\n334cd".match(/ab(.*)cd/) # nil
p "ab\r\n334cd".match(/ab(.*)cd/m) # #<MatchData "ab\r\n334cd" 1:"\r\n334">
2.
str = "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><pre>\n<code class=\"ruby\">\nputs \"hello world\"\nputs \"hello world\"\n</code>\n</pre></body></html>\n"
p str.match(/<body>(.*)<\/body>/m)[0] # "<body><pre>\n<code class=\"ruby\">\nputs \"hello world\"\nputs \"hello world\"\n</code>\n</pre></body>"
# Without /m the match fails and match returns nil, so calling [0] on the result would raise NoMethodError.
p str.match(/<body>(.*)<\/body>/) # nil
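When a page contains several blocks of the same kind, the greedy .* used above will span from the first opening tag to the last closing tag. The sketch below uses a made-up HTML string with two code blocks to compare greedy and non-greedy matching under /m:

```ruby
# Hypothetical input: two <pre><code> blocks, one after the other.
html = "<pre>\n<code class=\"ruby\">\nputs \"a\"\n</code>\n</pre>\n" \
       "<pre>\n<code class=\"ruby\">\nputs \"b\"\n</code>\n</pre>\n"

# Greedy: .* runs on to the last </code>, swallowing both blocks.
greedy = html.scan(/<code[^>]*>(.*)<\/code>/m).flatten

# Non-greedy: .*? stops at the first </code> after each opening tag.
lazy = html.scan(/<code[^>]*>(.*?)<\/code>/m).flatten

p greedy.size  # 1
p lazy         # ["\nputs \"a\"\n", "\nputs \"b\"\n"]
```

For anything beyond quick extraction like this, an HTML parser is usually the more robust choice, since regexes do not understand nesting.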