Module:String2

From 决策链云智库
Revision as of 21:32, 21 July 2023 (Fri) by Zeroclanzhang (talk | contribs) (page created)


The module String2 contains a number of string manipulation functions that are much less commonly used than those in Module:String. Because Module:String is cascade-protected (some of its functions are used on the Main Page), it cannot be edited or maintained by template editors, only by admins. String-handling functions rarely need maintenance, but it is useful for template editors to be able to maintain them where possible, so this module can be used by template editors to develop novel functionality.

The module contains three case-related calls that convert strings to first letter uppercase, sentence case or title case and two calls that are useful for working with substrings. There are other utility calls that strip leading zeros from padded numbers and transform text so that it is not interpreted as wikitext, and several other calls that solve specific problems for template developers such as finding the position of a piece of text on a given page.

The functions are designed with the possibility of working with text returned from Wikidata in mind. However, a call to Wikidata may return empty, so the functions should generally fail gracefully if supplied with a missing or blank input parameter, rather than throwing an error.
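
A minimal sketch of that defensive pattern in plain Lua, outside Scribunto. The helper name clean_arg is hypothetical and is not part of the module; it only illustrates treating a missing parameter as an empty string before any string function touches it.

```lua
-- Hypothetical helper (not in String2): normalise a possibly missing argument.
local function clean_arg(value)
  if value == nil then
    return ""  -- missing parameter: treat it as blank instead of raising an error
  end
  -- tostring() copes with numbers; the pattern trims surrounding whitespace
  return (tostring(value):match("^%s*(.-)%s*$"))
end

print("[" .. clean_arg(nil) .. "]")
print("[" .. clean_arg("  action game  ") .. "]")
```

Inside the module the same effect is obtained with mw.text.trim(frame.args[1] or ""), as the source below shows for most functions.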

Functions

trim

The trim function simply trims whitespace characters from the start and end of the string.

title

The title function capitalises the first letter of each word in the text, apart from a number of short words listed in The U.S. Government Printing Office Style Manual §3.49 "Center and side heads": a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor.

This is a very simplistic algorithm; see Template:Title case/doc for some of its limitations.

sentence

The sentence function finds the first letter and capitalises it, then renders the rest of the text in lower case. It works properly with text containing wiki markup. Compare {{#invoke:String2|sentence|[[action game]]}} → Action game with {{ucfirst:{{lc:[[action game]]}}}} → action game. Piped wiki-links are handled as well:

  • {{#invoke:String2|sentence|[[trimix (breathing gas)|trimix]]}} → Trimix

So are lists:

  • {{#invoke:String2 |sentence |{{hlist ||[[apples]] |[[pears]] |[[oranges]]}}}}

ucfirst

The ucfirst function is similar to sentence; it renders the first alphabetical character in upper case, but leaves the capitalisation of the rest of the text unaltered. This is useful if the text contains proper nouns, but it will not regularise sentences that are ALLCAPS, for example. It also works with text containing piped wiki-links and with html lists.

findlast

  • Function findlast finds the last item in a list.
  • The first unnamed parameter is the list. The list is trimmed of leading and trailing whitespace.
  • The second, optional unnamed parameter is the list separator (default = comma space). The separator is not trimmed of leading and trailing whitespace (so that leading or trailing spaces can be used).
  • It returns the whole list if the separator is not found.

One potential issue is that using Lua special pattern characters (^$()%.[]*+-?) as the separator will probably cause problems.
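
One way to sidestep the pattern-character issue is to search with plain matching. The sketch below is written in plain Lua for illustration and is not the module's own implementation; the name find_last and its argument handling are assumptions.

```lua
-- Hypothetical stand-alone sketch of the findlast behaviour described above.
local function find_last(list, sep)
  list = (list or ""):match("^%s*(.-)%s*$")  -- the list is trimmed
  sep = sep or ", "                          -- the separator is used as given
  if sep == "" then
    return list
  end
  local last_end = 0  -- position of the final character of the last separator
  local init = 1
  while true do
    -- fourth argument true = plain find, so ^$()%.[]*+-? have no special meaning
    local s, e = list:find(sep, init, true)
    if not s then
      break
    end
    last_end = e
    init = e + 1
  end
  -- with no separator found, last_end is 0 and the whole list is returned
  return list:sub(last_end + 1)
end

print(find_last("5, 932, 992,532, 6,074,702, 6,145,291"))
print(find_last("a.b.c", "."))
```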

Examples
Case: Wikitext → Output
Normal usage: {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291}} → Script error: The function "findlast" does not exist.
Space as separator: {{#invoke:String2 |findlast | 5 932 992,532 6,074,702 6,145,291 }} → Script error: The function "findlast" does not exist.
One item list: {{#invoke:String2 |findlast | 6,074,702 }} → Script error: The function "findlast" does not exist.
Separator not found: {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291 |;}} → Script error: The function "findlast" does not exist.
List missing: {{#invoke:String2 |findlast |}} → Script error: The function "findlast" does not exist.

split

The split function splits text at boundaries specified by separator and returns the chunk for the index idx (starting at 1). It can use positional parameters or named parameters (but these should not be mixed):

Usage
{{#invoke:String2 |split |text |separator |index |true/false}}
{{#invoke:String2 |split |txt=text |sep=separator |idx=index |plain=true/false}}

Any double quotes (") in the separator parameter are stripped out, which allows spaces and wikitext like ["[ to be passed. Use {{!}} for the pipe character |.

If the optional plain parameter is set to false / no / 0 then separator is treated as a Lua pattern. The default is plain=true, i.e. normal text matching.

The index parameter is optional; it defaults to the first chunk of text.
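
The parameter handling described above can be sketched in plain Lua as follows. This is an illustration under stated assumptions, not the module's code: the name split_chunk is hypothetical, and returning an empty string for an out-of-range index is an assumption, since the documentation does not say what the module does in that case.

```lua
-- Hypothetical stand-alone sketch of split: return chunk number idx of text.
local function split_chunk(text, sep, idx, plain)
  text = text or ""
  sep = (sep or ""):gsub('"', "")  -- double quotes in the separator are stripped
  idx = tonumber(idx) or 1         -- the index defaults to the first chunk
  if plain == nil then
    plain = true                   -- default is normal text matching
  end
  if sep == "" then
    return text
  end
  local count, init = 0, 1
  while true do
    local s, e = text:find(sep, init, plain)
    if not s or e < s then
      -- no further separator (or an empty pattern match): the rest is the last chunk
      s, e = #text + 1, #text
    end
    count = count + 1
    if count == idx then
      return text:sub(init, s - 1)
    end
    if s > #text then
      return ""                    -- assumption: index beyond the last chunk
    end
    init = e + 1
  end
end

print(split_chunk("a|b|c", "|", 2))
print(split_chunk("Apples pears oranges; Cats dogs", '"%A+"', 3, false))
```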

Template:String split is a convenience wrapper for the split function.

stripZeros

The stripZeros function finds the first number in a string of text and strips leading zeros, but retains a zero which is followed by a decimal point. For example: "0940" → "940"; "Year: 0023" → "Year: 23"; "00.12" → "0.12"

nowiki

The nowiki function ensures that a string of text is treated by the MediaWiki software as just a string, not code. It trims leading and trailing whitespace.

val2percent

The val2percent function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each number it finds into a percentage, then returns the resulting string.
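
A rough plain-Lua sketch of that scan-and-convert idea, assuming "percentage" means the value multiplied by 100 with a percent sign appended. The name val_to_percent and the handling of a sentence-ending full stop are assumptions, and floating-point rounding of awkward fractions is not addressed.

```lua
-- Hypothetical stand-alone sketch of val2percent.
local function val_to_percent(text)
  local function to_percent(num)
    local trailing = ""
    if num:sub(-1) == "." then
      -- a dot with no digits after it ends a sentence; it is not a decimal point
      num, trailing = num:sub(1, -2), "."
    end
    local x = (tonumber(num) or 0) * 100
    local formatted
    if x == math.floor(x) then
      formatted = string.format("%d", x)  -- whole number: no decimal part shown
    else
      formatted = tostring(x)
    end
    return formatted .. "%" .. trailing
  end
  -- digits, optionally followed by a dot and more digits
  return ((text or ""):gsub("%d+%.?%d*", to_percent))
end

print(val_to_percent("0.5 of the total and 0.25 of the rest"))
```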

one2a

The one2a function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each occurrence of 'one ' into either 'a ' or 'an ', then returns the resultant string.

The Template:One2a is a convenience wrapper for the one2a function.

findpagetext

The findpagetext function returns the position of a piece of text in the wikitext source of a page. It takes up to four parameters:

  • First positional parameter or |text is the text to be searched for.
  • Optional parameter |title is the page title, defaults to the current page.
  • Optional parameter |plain is either true for a plain search (default), or false for a Lua pattern search.
  • Optional parameter |nomatch is the value returned when no match is found; default is nothing.
Examples
{{#invoke:String2 |findpagetext |text=Youghiogheny}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=Youghiogheny |nomatch=not found}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=Youghiogheny |title=Boston Bridge |nomatch=not found}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=river |title=Boston Bridge |nomatch=not found}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=[Rr]iver |title=Boston Bridge |plain=false |nomatch=not found}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=%[%[ |title=Boston Bridge |plain=f |nomatch=not found}} → Script error: The function "findpagetext" does not exist.
{{#invoke:String2 |findpagetext |text=%{%{[Cc]oord |title=Boston Bridge |plain=f |nomatch=not found}} → Script error: The function "findpagetext" does not exist.

The search is case-sensitive, so Lua pattern matching is needed to find river or River. The last example finds {{coord and {{Coord. The penultimate example finds a wiki-link.

The Template:Findpagetext is a convenience wrapper for this function.

strip

The strip function strips the first positional parameter of the characters or pattern supplied in the second positional parameter.

Usage
{{#invoke:String2|strip|source_string|characters_to_strip|plain_flag}}
{{#invoke:String2|strip|source=|chars=|plain=}}
Examples
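{{#invoke:String2|strip|abc123def|123}} → Script error: The function "strip" does not exist.
{{#invoke:String2|strip|abc123def|%d+|false}} → Script error: The function "strip" does not exist.
{{#invoke:String2|strip|source=abc123def|chars=123}} → Script error: The function "strip" does not exist.
{{#invoke:String2|strip|source=abc123def|chars=%d+|plain=false}} → Script error: The function "strip" does not exist.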
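
Since the examples above produce no output on this wiki, here is a plain-Lua sketch of the behaviour the usage lines describe. It is an illustration only: the name strip_chars is hypothetical, and it assumes that in plain mode the second parameter is removed as a literal substring rather than as a set of individual characters.

```lua
-- Hypothetical stand-alone sketch of strip.
local function strip_chars(source, chars, plain)
  source = source or ""
  chars = chars or ""
  if chars == "" then
    return source
  end
  if plain == nil then
    plain = true
  end
  if plain then
    -- escape every non-alphanumeric character so that it matches literally
    chars = chars:gsub("%W", "%%%0")
  end
  -- parentheses drop gsub's second return value (the replacement count)
  return (source:gsub(chars, ""))
end

print(strip_chars("abc123def", "123"))
print(strip_chars("abc123def", "%d+", false))
```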

matchAny

The matchAny function returns the index of the first positional parameter to match the source parameter. If the plain parameter is set to false (default true) then the search strings are Lua patterns. This can usefully be put in a switch statement to pick a switch case based on which pattern a string matches. Returns the empty string if nothing matches, for use in {{#if}}.

{{#invoke:String2|matchAny|123|abc|source=abc 124}} returns 2.
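
The lookup can be sketched in plain Lua as below. The name match_any and the table-based argument list are assumptions made for the illustration; the module itself reads its search strings from the positional template parameters.

```lua
-- Hypothetical stand-alone sketch of matchAny.
local function match_any(source, candidates, plain)
  if plain == nil then
    plain = true  -- default: search strings are plain text, not Lua patterns
  end
  for i, candidate in ipairs(candidates) do
    if source:find(candidate, 1, plain) then
      return i    -- index of the first search string found in source
    end
  end
  return ""       -- empty string when nothing matches, suitable for {{#if}}
end

print(match_any("abc 124", { "123", "abc" }))
```

Note that in this sketch an empty search string matches any source, because an empty string is found at position 1.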

hyphen2dash

Extracted hyphen_to_dash() function from Module:Citation/CS1.

Converts a hyphen to a dash under certain conditions. The hyphen must separate like items; unlike items are returned unmodified. These forms are modified:

  • letter - letter (A - B)
  • digit - digit (4-5)
  • digit separator digit - digit separator digit (4.1-4.5 or 4-1-4-5)
  • letterdigit - letterdigit (A1-A5) (an optional separator between letter and digit is supported – a.1-a.5 or a-1-a-5)
  • digitletter - digitletter (5a - 5d) (an optional separator between letter and digit is supported – 5.a-5.d or 5-a-5-d)

Any other forms are returned unmodified.

The input string may be a comma- or semicolon-separated list. Semicolons are converted to commas.

{{#invoke:String2|hyphen2dash|1=1-2}} returns: Script error: The function "hyphen2dash" does not exist.

{{#invoke:String2|hyphen2dash|1=1-2; 4–10}} returns: Script error: The function "hyphen2dash" does not exist.

Accept-this-as-written markup is supported, e.g. {{#invoke:String2|hyphen2dash|1=((1-2)); 4–10}} returns: Script error: The function "hyphen2dash" does not exist.

By default, a normal space is inserted after the separating comma in lists. An optional second parameter allows this to be changed to a different character (e.g. a thin space or hair space).

startswith

The startswith function is similar to {{#invoke:string|endswith}}. Both parameters are required, although they can be blank. Leading and trailing whitespace is counted; use named parameters to avoid this if required. It outputs "yes" for true and blank for false, so the result may be passed directly to #if.

Markup → Renders as
{{#invoke:string2|startswith|search|se}} → Script error: The function "startswith" does not exist.
{{#invoke:string2|startswith|search|ch}} → Script error: The function "startswith" does not exist.
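
The intended results can be illustrated with a plain-Lua sketch. The name starts_with is hypothetical, and returning "yes" for a blank prefix is an assumption; the documentation only says that blank parameters are allowed.

```lua
-- Hypothetical stand-alone sketch of startswith.
local function starts_with(text, prefix)
  text, prefix = text or "", prefix or ""
  -- compare the leading #prefix bytes; whitespace is significant
  if text:sub(1, #prefix) == prefix then
    return "yes"
  end
  return ""  -- blank for false, so the result can feed {{#if}} directly
end

print(starts_with("search", "se"))
print(starts_with("search", "ch"))
```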

Usage

  • {{#invoke:String2 | sentence |…}} - Capitalizes the first character and shifts the rest to lowercase
    • Although similar to magic words' {{ucfirst:}} function, this call works even with piped wiki-links because it searches beyond leading brackets and other non-alphanumeric characters.
    • It now also recognises when it has an html list passed to it and capitalises the first alphabetic letter beyond the list item markup (<li>) and any piped links that may be there.
  • {{#invoke:String2 | ucfirst |…}} - Capitalizes the first alphabetic character and leaves the rest unaltered
    • Works with piped wiki-links and html lists
  • {{#invoke:String2 | title |…}} - Capitalizes all words, except for a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor.
  • {{#invoke:String2 | stripZeros |…}} - Removes leading padding zeros from the first number it finds in the string
  • {{#invoke:String2 | nowiki |…}} - Renders the string as plain text without wikicode

Parameters

These functions take one unnamed parameter comprising (or invoking as a string) the text to be manipulated:

  • title
  • sentence
  • ucfirst

Examples

Input → Output
{{#invoke:String2| ucfirst | abcd }} → Abcd
{{#invoke:String2| ucfirst | abCD }} → AbCD
{{#invoke:String2| ucfirst | ABcd }} → ABcd
{{#invoke:String2| ucfirst | ABCD }} → ABCD
{{#invoke:String2| ucfirst | 123abcd }} → 123Abcd
{{#invoke:String2| ucfirst | }} → (blank)
{{#invoke:String2| ucfirst | human X chromosome }} → Human X chromosome
{{#invoke:String2 | ucfirst | {{#invoke:WikidataIB |getValue | P136 |fetchwikidata=ALL |onlysourced=no |qid=Q1396889}} }} → Lua error in package.lua at line 80: module 'Module:i18n' not found
{{#invoke:String2 | ucfirst | {{#invoke:WikidataIB |getValue | P106 |fetchwikidata=ALL |list=hlist |qid=Q453196}} }} → Lua error in package.lua at line 80: module 'Module:i18n' not found

{{#invoke:String2| sentence | abcd }} → Abcd
{{#invoke:String2| sentence | abCD }} → Abcd
{{#invoke:String2| sentence | ABcd }} → Abcd
{{#invoke:String2| sentence | ABCD }} → Abcd
{{#invoke:String2| sentence | [[action game]] }} → Action game
{{#invoke:String2| sentence | [[trimix (breathing gas)|trimix]] }} → Trimix
{{#invoke:String2| sentence | }} → (blank)

{{#invoke:String2| title | abcd }} → Abcd
{{#invoke:String2| title | abCD }} → Abcd
{{#invoke:String2| title | ABcd }} → Abcd
{{#invoke:String2| title | ABCD }} → Abcd
{{#invoke:String2| title | }} → (blank)
{{#invoke:String2| title | the vitamins are in my fresh california raisins}} → The Vitamins Are in My Fresh California Raisins

String split

Template:String split is a convenience wrapper for the split function.

Modules may return strings with | as separators like this: {{#invoke:carousel | main | name = WPDogs | switchsecs = 5 }} → Script error: No such module "carousel".

  • {{String split |{{#invoke:carousel | main | name = WPDogs | switchsecs = 5 }}|{{!}}| 2}} → Template:String split

Lua patterns can allow splitting at classes of characters such as punctuation:

Or split on anything that isn't a letter (no is treated as false):

Named parameters force the trimming of leading and trailing spaces in the parameters and are generally clearer when used:

  • {{String split | txt=Apples pears oranges; Cats dogs | sep="%A+" | idx=3 | plain=false }} → Template:String split

One2a

Template:One2a is a convenience wrapper for the one2a function.

Capitalisation is kept. Aimed for usage with {{Convert}}.

  • {{one2a |One foot. One mile. One kilometer. One inch.One amp. one foot. one mile. one inch. Alone at last. Onely the lonely. ONE ounce. One monkey.}} → Template:One2a
  • {{convert|1|ft|spell=on}} → one foot (zero point three zero metres)
  • {{one2a|{{convert|1|ft|spell=on}}}} → Template:One2a
  • {{convert|2.54|cm|0|disp=out|spell=on}} → one inch
  • {{one2a|{{convert|2.54|cm|0|disp=out|spell=on}}}} → Template:One2a

See also

Module:String for the following functions:

  • len
  • sub
  • sublength
  • match
  • pos
  • str_find
  • find
  • replace
  • rep

Templates and modules related to capitalization: Template:Case templates see also

Templates that implement <nowiki>


local p = {}


p.upper = function(frame)
	local s = mw.text.trim(frame.args[1] or "")
	return string.upper(s)
end

p.lower = function(frame)
	local s = mw.text.trim(frame.args[1] or "")
	return string.lower(s)
end


p.sentence = function (frame )
	-- default to "" so that a missing parameter does not raise an error in string.lower
	frame.args[1] = string.lower(frame.args[1] or "")
	return p.ucfirst(frame)
end


p.ucfirst = function (frame )
	local s =  mw.text.trim( frame.args[1] or "" )
	local s1 = ""
	-- if it's a list chop off and (store as s1) everything up to the first <li>
	local lipos = string.find(s, "<li>" )
	if lipos then
		s1 = string.sub(s, 1, lipos + 3)
		s = string.sub(s, lipos + 4)
	end
	-- s1 is either "" or the first part of the list markup, so we can continue
	-- and prepend s1 to the returned string
	if string.find(s, "^%[%[[^|]+|[^%]]+%]%]") then
		-- this is a piped wikilink, so we capitalise the text, not the pipe
		local _, c = string.find(s, "|%A*%a") -- find the first letter after the pipe
		if not c then
			-- no letter follows the pipe (e.g. [[foo|123]]), so nothing to capitalise
			return s1 .. s
		end
		return s1 .. string.sub(s, 1, c-1) .. string.upper(string.sub(s, c, c)) .. string.sub(s, c+1)
	end
	local letterpos = string.find(s, '%a')
	if letterpos then
		local first = string.sub(s, 1, letterpos - 1)
		local letter = string.sub(s, letterpos, letterpos)
		local rest = string.sub(s, letterpos + 1)
		return s1 .. first .. string.upper(letter) .. rest
	else
		return s1 .. s
	end
end


p.title = function (frame )
	-- http://grammar.yourdictionary.com/capitalization/rules-for-capitalization-in-titles.html
	-- recommended by The U.S. Government Printing Office Style Manual:
	-- "Capitalize all words in titles of publications and documents,
	-- except a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor."
	local alwayslower = {['a'] = 1, ['an'] = 1, ['the'] = 1, 
		['and'] = 1, ['but'] = 1, ['or'] = 1, ['for'] = 1,
		['nor'] = 1, ['on'] = 1, ['in'] = 1, ['at'] = 1, ['to'] = 1,
		['from'] = 1, ['by'] = 1, ['of'] = 1, ['up'] = 1 }
	local s =  mw.text.trim( frame.args[1] or "" )
	local words = mw.text.split( s, " ")
	for i, word in ipairs(words) do
		word = string.lower( word )
		-- the first word is always capitalised, even if it is listed in alwayslower
		if i == 1 or alwayslower[word] ~= 1 then
			word = mw.getContentLanguage():ucfirst(word)
		end
		words[i] = word
	end
	return table.concat(words, " ")
end


-- stripZeros finds the first number and strips leading zeros (apart from units)
-- e.g "0940" -> "940"; "Year: 0023" -> "Year: 23"; "00.12" -> "0.12"
p.stripZeros = function(frame)
	local s = mw.text.trim(frame.args[1] or "")
	-- declared local so that the value does not leak into the global environment
	local n = tonumber( string.match( s, "%d+" ) ) or ""
	s = string.gsub( s, "%d+", n, 1 )
	return s
end


-- nowiki ensures that a string of text is treated by the MediaWiki software as just a string
-- it takes an unnamed parameter and trims whitespace, then removes any wikicode
p.nowiki = function(frame)
	local str = mw.text.trim(frame.args[1] or "")
	return mw.text.nowiki(str)
end


-- posnq (position, no quotes) returns the numerical start position of the first occurrence
-- of one piece of text ("match") inside another ("str").
-- It returns nil if no match is found, or if either parameter is blank.
-- It takes the text to be searched in as the first unnamed parameter, which is trimmed.
-- It takes the text to match as the second unnamed parameter, which is trimmed and
-- any double quotes " are stripped out.
p.posnq = function(frame)
	local str = mw.text.trim(frame.args[1] or "")
	local match = mw.text.trim(frame.args[2] or ""):gsub('"', '')
	if  str == "" or match == "" then return nil end
	-- just take the start position
	local pos = str:find(match, 1, true)
	return pos
end


return p